PREFACE


A glance at the table of contents will reveal that this textbook treats topics in 
analysis at the “Advanced Calculus” level. The aim has been to provide a develop- 
ment of the subject which is honest, rigorous, up to date, and, at the same time, 
not too pedantic. The book provides a transition from elementary calculus to 
advanced courses in real and complex function theory, and it introduces the reader 
to some of the abstract thinking that pervades modern analysis.

The second edition differs from the first in many respects. Point set topology
is developed in the setting of general metric spaces as well as in Euclidean n-space,
and two new chapters have been added on Lebesgue integration. The material on 
line integrals, vector analysis, and surface integrals has been deleted. The order of 
some chapters has been rearranged, many sections have been completely rewritten, 
and several new exercises have been added. 

The development of Lebesgue integration follows the Riesz-Nagy approach 
which focuses directly on functions and their integrals and does not depend on 
measure theory. The treatment here is simplified, spread out, and somewhat 
rearranged for presentation at the undergraduate level. 

The first edition has been used in mathematics courses at a variety of levels, 
from first-year undergraduate to first-year graduate, both as a text and as supple- 
mentary reference. The second edition preserves this flexibility. For example, 
Chapters 1 through 5, 12, and 13 provide a course in differential calculus of functions
of one or more variables. Chapters 6 through 11, 14, and 15 provide a course
in integration theory. Many other combinations are possible; individual instructors
can choose topics to suit their needs by consulting the diagram on the next page, 
which displays the logical interdependence of the chapters. 

I would like to express my gratitude to the many people who have taken the 
trouble to write me about the first edition. Their comments and suggestions 
influenced the preparation of the second edition. Special thanks are due Dr. 
Charalambos Aliprantis who carefully read the entire manuscript and made 
numerous helpful suggestions. He also provided some of the new exercises. 
Finally, I would like to acknowledge my debt to the undergraduate students of 
Caltech whose enthusiasm for mathematics provided the original incentive for this
work. 


Pasadena 
September 1973 


T.M.A.

CHAPTER 1


THE REAL AND 
COMPLEX NUMBER SYSTEMS 


1.1 INTRODUCTION 

Mathematical analysis studies concepts related in some way to real numbers, so 
we begin our study of analysis with a discussion of the real-number system. 

Several methods are used to introduce real numbers. One method starts with 
the positive integers 1, 2, 3, . . . as undefined concepts and uses them to build a 
larger system, the positive rational numbers (quotients of positive integers), their 
negatives, and zero. The rational numbers, in turn, are then used to construct the
irrational numbers, real numbers like $\sqrt{2}$ and $\pi$ which are not rational. The rational
and irrational numbers together constitute the real-number system. 

Although these matters are an important part of the foundations of math- 
ematics, they will not be described in detail here. As a matter of fact, in most 
phases of analysis it is only the properties of real numbers that concern us, rather 
than the methods used to construct them. Therefore, we shall take the real numbers 
themselves as undefined objects satisfying certain axioms from which further 
properties will be derived. Since the reader is probably familiar with most of the 
properties of real numbers discussed in the next few pages, the presentation will 
be rather brief. Its purpose is to review the important features and persuade the 
reader that, if it were necessary to do so, all the properties could be traced back 
to the axioms. More detailed treatments can be found in the references at the end 
of this chapter. 

For convenience we use some elementary set notation and terminology. Let 
S denote a set (a collection of objects). The notation $x \in S$ means that the object x
is in the set S, and we write $x \notin S$ to indicate that x is not in S.

A set S is said to be a subset of T, and we write $S \subseteq T$, if every object in S is
also in T. A set is called nonempty if it contains at least one object.

We assume there exists a nonempty set R of objects, called real numbers, 
which satisfy the ten axioms listed below. The axioms fall in a natural way into 
three groups which we refer to as the field axioms, the order axioms, and the 
completeness axiom (also called the least-upper-bound axiom or the axiom of 
continuity).

1.2 THE FIELD AXIOMS 

Along with the set R of real numbers we assume the existence of two operations,
called addition and multiplication, such that for every pair of real numbers x and y
the sum x + y and the product xy are real numbers uniquely determined by x
and y satisfying the following axioms. (In the axioms that appear below, x, y,
z represent arbitrary real numbers unless something is said to the contrary.)

Axiom 1. x + y = y + x, xy = yx (commutative laws).

Axiom 2. x + (y + z) = (x + y) + z, x(yz) = (xy)z (associative laws).

Axiom 3. x(y + z) = xy + xz (distributive law).

Axiom 4. Given any two real numbers x and y, there exists a real number z such that
x + z = y. This z is denoted by y - x; the number x - x is denoted by 0. (It
can be proved that 0 is independent of x.) We write -x for 0 - x and call -x the
negative of x.

Axiom 5. There exists at least one real number $x \ne 0$. If x and y are two real
numbers with $x \ne 0$, then there exists a real number z such that xz = y. This z is
denoted by y/x; the number x/x is denoted by 1 and can be shown to be independent of
x. We write $x^{-1}$ for 1/x if $x \ne 0$ and call $x^{-1}$ the reciprocal of x.

From these axioms all the usual laws of arithmetic can be derived; for example,
$-(-x) = x$, $(x^{-1})^{-1} = x$, $-(x - y) = y - x$, $x - y = x + (-y)$, etc. (For
a more detailed explanation, see Reference 1.1.) 


1.3 THE ORDER AXIOMS 

We also assume the existence of a relation < which establishes an ordering among 
the real numbers and which satisfies the following axioms:

Axiom 6. Exactly one of the relations x = y, x < y, x > y holds.

note. x > y means the same as y < x.

Axiom 7. If x < y, then for every z we have x + z < y + z. 

Axiom 8. If x > 0 and y > 0, then xy > 0. 

Axiom 9. If x > y and y > z, then x > z. 

note. A real number x is called positive if x > 0, and negative if x < 0. We
denote by $R^+$ the set of all positive real numbers, and by $R^-$ the set of all negative
real numbers.

From these axioms we can derive the usual rules for operating with inequalities. 
For example, if we have x < y, then xz < yz if z is positive, whereas xz > yz if 
z is negative. Also, if x > y and z > w where both y and w are positive, then 
xz > yw. (For a complete discussion of these rules see Reference 1.1.) 

note. The symbolism $x \le y$ is used as an abbreviation for the statement:

“x < y or x = y.”

Thus we have $2 \le 3$ since 2 < 3; and $2 \le 2$ since 2 = 2. The symbol $\ge$ is
similarly used. A real number x is called nonnegative if $x \ge 0$. A pair of simultaneous
inequalities such as x < y, y < z is usually written more briefly as
x < y < z.

The following theorem, which is a simple consequence of the foregoing axioms, 
is often used in proofs in analysis. 

Theorem 1.1. Given real numbers a and b such that

$$a \le b + \varepsilon \quad \text{for every } \varepsilon > 0. \tag{1}$$

Then $a \le b$.

Proof. If b < a, then inequality (1) is violated for $\varepsilon = (a - b)/2$ because

$$b + \varepsilon = b + \frac{a - b}{2} = \frac{a + b}{2} < \frac{a + a}{2} = a.$$

Therefore, by Axiom 6 we must have $a \le b$.

Axiom 10, the completeness axiom, will be described in Section 1.11. 


1.4 GEOMETRIC REPRESENTATION OF REAL NUMBERS 

The real numbers are often represented geometrically as points on a line (called 
the real line or the real axis). A point is selected to represent 0 and another to
represent 1, as shown in Fig. 1.1. This choice determines the scale. Under an 
appropriate set of axioms for Euclidean geometry, each point on the real line 
corresponds to one and only one real number and, conversely, each real number 
is represented by one and only one point on the line. It is customary to refer to 
the point x rather than the point representing the real number x. 

[Figure 1.1: the real line, with the points 0, 1, x, y marked from left to right.]

The order relation has a simple geometric interpretation. If x < y, the point 
x lies to the left of the point y, as shown in Fig. 1.1. Positive numbers lie to the 
right of 0, and negative numbers to the left of 0. If a < b, a point x satisfies the 
inequalities a < x < b if and only if x is between a and b. 


1.5 INTERVALS 

The set of all points between a and b is called an interval. Sometimes it is important 
to distinguish between intervals which include their endpoints and intervals which 
do not. 

notation. The notation {x : x satisfies P} will be used to designate the set of
all real numbers x which satisfy property P.

Definition 1.2. Assume a < b. The open interval (a, b) is defined to be the set

$$(a, b) = \{x : a < x < b\}.$$

The closed interval [a, b] is the set $\{x : a \le x \le b\}$. The half-open intervals
(a, b] and [a, b) are similarly defined, using the inequalities $a < x \le b$ and
$a \le x < b$, respectively. Infinite intervals are defined as follows:

$$(a, +\infty) = \{x : x > a\}, \qquad [a, +\infty) = \{x : x \ge a\},$$

$$(-\infty, a) = \{x : x < a\}, \qquad (-\infty, a] = \{x : x \le a\}.$$

The real line R is sometimes referred to as the open interval $(-\infty, +\infty)$. A
single point is considered as a “degenerate” closed interval.

note. The symbols $+\infty$ and $-\infty$ are used here purely for convenience in notation
and are not to be considered as being real numbers. Later we shall extend the 
real-number system to include these two symbols, but until this is done, the reader 
should understand that all real numbers are “finite.” 

1.6 INTEGERS 

This section describes the integers, a special subset of R. Before we define the
integers it is convenient to introduce first the notion of an inductive set.

Definition 1.3. A set of real numbers is called an inductive set if it has the following
two properties:

a) The number 1 is in the set. 

b) For every x in the set, the number x + 1 is also in the set. 

For example, R is an inductive set. So is the set $R^+$. Now we shall define the
positive integers to be those real numbers which belong to every inductive set.

Definition 1.4. A real number is called a positive integer if it belongs to every
inductive set. The set of positive integers is denoted by $Z^+$.

The set $Z^+$ is itself an inductive set. It contains the number 1, the number
1 + 1 (denoted by 2), the number 2 + 1 (denoted by 3), and so on. Since $Z^+$ is a
subset of every inductive set, we refer to $Z^+$ as the smallest inductive set. This
property of $Z^+$ is sometimes called the principle of induction. We assume the
reader is familiar with proofs by induction which are based on this principle.
(See Reference 1.1.) Examples of such proofs are given in the next section.

The negatives of the positive integers are called the negative integers. The
positive integers, together with the negative integers and 0 (zero), form a set Z
which we call simply the set of integers.

1.7 THE UNIQUE FACTORIZATION THEOREM FOR INTEGERS 

If n and d are integers and if n = cd for some integer c, we say d is a divisor of n,
or n is a multiple of d, and we write $d \mid n$ (read: d divides n). An integer n is called
a prime if n > 1 and if the only positive divisors of n are 1 and n. If n > 1 and n
is not prime, then n is called composite. The integer 1 is neither prime nor composite. 

This section derives some elementary results on factorization of integers, 
culminating in the unique factorization theorem, also called the fundamental theorem 
of arithmetic. 

The fundamental theorem states that (1) every integer n > 1 can be represented 
as a product of prime factors, and (2) this factorization can be done in only one 
way, apart from the order of the factors. It is easy to prove part (1). 

Theorem 1.5. Every integer n > 1 is either a prime or a product of primes. 

Proof. We use induction on n. The theorem holds trivially for n = 2. Assume 
it is true for every integer k with 1 < k < n. If n is not prime it has a positive 
divisor d with 1 < d < n. Hence n = cd, where 1 < c < n. Since both c and 
d are less than n, each is a prime or a product of primes; hence n is a product of primes.

Before proving part (2), uniqueness of the factorization, we introduce some 
further concepts. 

If $d \mid a$ and $d \mid b$ we say d is a common divisor of a and b. The next theorem
shows that every pair of integers a and b has a common divisor which is a linear 
combination of a and b. 

Theorem 1.6. Every pair of integers a and b has a common divisor d of the form 

d = ax + by 

where x and y are integers. Moreover, every common divisor of a and b divides 
this d. 

Proof. First assume that $a \ge 0$, $b \ge 0$ and use induction on n = a + b. If
n = 0 then a = b = 0, and we can take d = 0 with x = y = 0. Assume, then,
that the theorem has been proved for 0, 1, 2, . . . , n - 1. By symmetry, we can
assume $a \ge b$. If b = 0 take d = a, x = 1, y = 0. If $b \ge 1$ we can apply the
induction hypothesis to a - b and b, since their sum is $a = n - b \le n - 1$.
Hence there is a common divisor d of a - b and b of the form d = (a - b)x + by.
This d also divides (a - b) + b = a, so d is a common divisor of a and b and
we have d = ax + (y - x)b, a linear combination of a and b. To complete the
proof we need to show that every common divisor divides d. Since a common
divisor divides a and b, it also divides the linear combination ax + (y - x)b = d.
This completes the proof if $a \ge 0$ and $b \ge 0$. If one or both of a and b is negative,
apply the result just proved to |a| and |b|.
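
The induction in this proof is constructive, so it can be traced in code. The following Python sketch is an illustration only: the name `linear_gcd` is hypothetical and does not come from the text. It follows the proof step by step (swap so that $a \ge b$, stop when b = 0, otherwise recurse on a - b and b) and is meant for small inputs, since the recursion depth grows with a + b.

```python
def linear_gcd(a, b):
    """Return (d, x, y) with d = a*x + b*y, where d >= 0 is a common divisor of a and b.

    Hypothetical helper that mirrors the induction on a + b used in the proof.
    """
    if a < 0 or b < 0:
        # Reduce to |a| and |b|, then move the signs into the coefficients.
        d, x, y = linear_gcd(abs(a), abs(b))
        return d, (-x if a < 0 else x), (-y if b < 0 else y)
    if a < b:
        # By symmetry we may assume a >= b.
        d, x, y = linear_gcd(b, a)
        return d, y, x
    if b == 0:
        # Base case of the proof: d = a, x = 1, y = 0.
        return a, 1, 0
    # Induction step: d = (a - b)*x + b*y = a*x + b*(y - x).
    d, x, y = linear_gcd(a - b, b)
    return d, x, y - x


print(linear_gcd(12, 18))  # (6, -1, 1), since 6 = 12*(-1) + 18*1
```

For a = b = 0 the sketch returns x = 1, y = 0 rather than x = y = 0; both choices give d = 0.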

note. If d is a common divisor of a and b of the form d = ax + by, then -d is 
also a divisor of the same form, -d = a(-x) + b(-y). Of these two common 
divisors, the nonnegative one is called the greatest common divisor of a and b, 
and is denoted by gcd(a, b) or simply by (a, b). If (a, b) = 1 then a and b are
said to be relatively prime. 

Theorem 1.7 (Euclid's Lemma). If $a \mid bc$ and (a, b) = 1, then $a \mid c$.

Proof. Since (a, b) = 1 we can write 1 = ax + by. Therefore c = acx + bcy.
But $a \mid acx$ and $a \mid bcy$, so $a \mid c$.

Theorem 1.8. If a prime p divides ab, then $p \mid a$ or $p \mid b$. More generally, if a prime p
divides a product $a_1 \cdots a_k$, then p divides at least one of the factors.

Proof. Assume $p \mid ab$ and that p does not divide a. If we prove that (p, a) = 1,
then Euclid's Lemma implies $p \mid b$. Let d = (p, a). Then $d \mid p$ so d = 1 or d = p.
We cannot have d = p because $d \mid a$ but p does not divide a. Hence d = 1. To
prove the more general statement we use induction on k, the number of factors. 
Details are left to the reader. 

Theorem 1.9 (Unique factorization theorem). Every integer n > 1 can be repre- 
sented as a product of prime factors in only one way, apart from the order of the 
factors. 

Proof. We use induction on n. The theorem is true for n = 2. Assume, then, 
that it is true for all integers greater than 1 and less than n. If n is prime there is 
nothing more to prove. Therefore assume that n is composite and that n has two 
factorizations into prime factors, say

$$n = p_1 p_2 \cdots p_s = q_1 q_2 \cdots q_t. \tag{2}$$

We wish to show that s = t and that each p equals some q. Since $p_1$ divides the
product $q_1 q_2 \cdots q_t$, it divides at least one factor. Relabel the q's if necessary so
that $p_1 \mid q_1$. Then $p_1 = q_1$ since both $p_1$ and $q_1$ are primes. In (2) we cancel $p_1$
on both sides to obtain

$$\frac{n}{p_1} = p_2 \cdots p_s = q_2 \cdots q_t.$$

Since n is composite, $1 < n/p_1 < n$; so by the induction hypothesis the two
factorizations of $n/p_1$ are identical, apart from the order of the factors. Therefore
the same is true in (2) and the proof is complete. 
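
Because the factorization is unique, any procedure that finds prime factors must return the same list once the factors are sorted. The Python sketch below uses trial division, which is a choice made here for illustration and is not a method described in the text; the name `prime_factors` is likewise hypothetical.

```python
def prime_factors(n):
    """Return the prime factors of an integer n > 1 in nondecreasing order."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as often as possible; any d that divides here is prime,
        # because all smaller prime factors have already been removed.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # What remains has no divisor between 2 and its square root, so it is prime.
        factors.append(n)
    return factors


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```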


1.8 RATIONAL NUMBERS 

Quotients of integers a/b (where $b \ne 0$) are called rational numbers. For example,
1/2, —7/5, and 6 are rational numbers. The set of rational numbers, which we 
denote by Q, contains Z as a subset. The reader should note that all the field 
axioms and the order axioms are satisfied by Q. 

We assume that the reader is familiar with certain elementary properties of 
rational numbers. For example, if a and b are rational, their average (a + b)/2 is
also rational and lies between a and b. Therefore between any two rational numbers
there are infinitely many rational numbers, which implies that if we are given a
certain rational number we cannot speak of the “next largest” rational number.

1.9 IRRATIONAL NUMBERS

Real numbers that are not rational are called irrational. For example, the numbers
$\sqrt{2}$, e, $\pi$ and $e^\pi$ are irrational.

Ordinarily it is not too easy to prove that some particular number is irrational.
There is no simple proof, for example, of the irrationality of $e^\pi$. However, the
irrationality of certain numbers such as $\sqrt{2}$ and $\sqrt{3}$ is not too difficult to establish
and, in fact, we easily prove the following:

Theorem 1.10. If n is a positive integer which is not a perfect square, then $\sqrt{n}$ is
irrational.

Proof. Suppose first that n contains no square factor > 1. We assume that $\sqrt{n}$ is
rational and obtain a contradiction. Let $\sqrt{n} = a/b$, where a and b are integers
having no factor in common. Then $nb^2 = a^2$ and, since the left side of this equation
is a multiple of n, so too is $a^2$. However, if $a^2$ is a multiple of n, a itself must be a
multiple of n, since n has no square factors > 1. (This is easily seen by examining
the factorization of a into its prime factors.) This means that a = cn, where c is
some integer. Then the equation $nb^2 = a^2$ becomes $nb^2 = c^2 n^2$, or $b^2 = nc^2$.
The same argument shows that b must also be a multiple of n. Thus a and b are
both multiples of n, which contradicts the fact that they have no factor in common.
This completes the proof if n has no square factor > 1.

If n has a square factor, we can write $n = m^2 k$, where k > 1 and k has no
square factor > 1. Then $\sqrt{n} = m\sqrt{k}$; and if $\sqrt{n}$ were rational, the number $\sqrt{k}$
would also be rational, contradicting that which was just proved.

A different type of argument is needed to prove that the number e is irrational.
(We assume familiarity with the exponential $e^x$ from elementary calculus and its
representation as an infinite series.)

Theorem 1.11. If $e^x = 1 + x + x^2/2! + x^3/3! + \cdots + x^n/n! + \cdots$, then the
number e is irrational.


Proof. We shall prove that $e^{-1}$ is irrational. The series for $e^{-1}$ is an alternating
series with terms which decrease steadily in absolute value. In such an alternating
series the error made by stopping at the nth term has the algebraic sign of the first
neglected term and is less in absolute value than the first neglected term. Hence,
if $s_n = \sum_{k=0}^{n} (-1)^k/k!$, we have the inequality

$$0 < e^{-1} - s_{2k-1} < \frac{1}{(2k)!},$$

from which we obtain

$$0 < (2k - 1)!\,(e^{-1} - s_{2k-1}) < \frac{1}{2k} \le \frac{1}{2}, \tag{3}$$

for any integer $k \ge 1$. Now $(2k - 1)!\, s_{2k-1}$ is always an integer. If $e^{-1}$ were
rational, then we could choose k so large that $(2k - 1)!\, e^{-1}$ would also be an
integer. Because of (3) the difference of these two integers would be a number
between 0 and 1/2, which is impossible. Thus $e^{-1}$ cannot be rational, and hence e
cannot be rational.
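
The two facts used in the proof can be checked numerically for small k. The Python sketch below is only a sanity check under stated assumptions: the partial sums are computed exactly with fractions, but $e^{-1}$ is replaced by its floating-point value, so the check is meaningful only while $(2k - 1)!$ stays small. The helper names `partial_sum` and `scaled_gap` are hypothetical.

```python
from fractions import Fraction
from math import exp, factorial


def partial_sum(n):
    """s_n = sum over k = 0..n of (-1)^k / k!, computed exactly."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


def scaled_gap(k):
    """Floating-point estimate of (2k-1)! * (e^{-1} - s_{2k-1})."""
    s = partial_sum(2 * k - 1)
    return factorial(2 * k - 1) * (exp(-1) - float(s))


for k in range(1, 6):
    s = partial_sum(2 * k - 1)
    # (2k-1)! * s_{2k-1} is an integer: the exact product has denominator 1.
    assert (factorial(2 * k - 1) * s).denominator == 1
    # The scaled gap should lie strictly between 0 and 1/(2k).
    print(k, scaled_gap(k), 1 / (2 * k))
```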

note. For a proof that $\pi$ is irrational, see Exercise 7.33.

The ancient Greeks were aware of the existence of irrational numbers as early 
as 500 b.c. However, a satisfactory theory of such numbers was not developed 
until late in the nineteenth century, at which time three different theories were 
introduced by Cantor, Dedekind, and Weierstrass. For an account of the theories 
of Dedekind and Cantor and their equivalence, see Reference 1.6.

1.10 UPPER BOUNDS, MAXIMUM ELEMENT, LEAST UPPER BOUND 
(SUPREMUM) 

Irrational numbers arise in algebra when we try to solve certain quadratic equations.
For example, it is desirable to have a real number x such that $x^2 = 2$. From
the nine axioms listed above we cannot prove that such an x exists in R because 
these nine axioms are also satisfied by Q and we have shown that there is no 
rational number whose square is 2. The completeness axiom allows us to introduce 
irrational numbers in the real-number system, and it gives the real-number system 
a property of continuity that is fundamental to many theorems in analysis. 

Before we describe the completeness axiom, it is convenient to introduce 
additional terminology and notation. 

Definition 1.12. Let S be a set of real numbers. If there is a real number b such
that $x \le b$ for every x in S, then b is called an upper bound for S and we say that
S is bounded above by b.

We say an upper bound because every number greater than b will also be an 
upper bound. If an upper bound b is also a member of S, then b is called the 
largest member or the maximum element of S. There can be at most one such b. 
If it exists, we write 

b = max S. 

A set with no upper bound is said to be unbounded above. 

Definitions of the terms lower bound, bounded below, smallest member (or
minimum element) can be similarly formulated. If S has a minimum element we 
denote it by min S. 

Examples 

1. The set $R^+ = (0, +\infty)$ is unbounded above. It has no upper bounds and no maximum
element. It is bounded below by 0 but has no minimum element.

2. The closed interval S = [0, 1] is bounded above by 1 and is bounded below by 0.
In fact, max S = 1 and min S = 0.

3. The half-open interval S = [0, 1) is bounded above by 1 but it has no maximum
element. Its minimum element is 0.

For sets like the one in Example 3, which are bounded above but have no
maximum element, there is a concept which takes the place of the maximum element.
It is called the least upper bound or supremum of the set and is defined as
follows:

Definition 1.13. Let S be a set of real numbers bounded above. A real number b is 
called a least upper bound for S if it has the following two properties: 

a) b is an upper bound for S. 

b) No number less than b is an upper bound for S. 

Examples. If S = [0, 1] the maximum element 1 is also a least upper bound for S. If
S = [0, 1) the number 1 is a least upper bound for S, even though S has no maximum
element.

It is an easy exercise to prove that a set cannot have two different least upper 
bounds. Therefore, if there is a least upper bound for S, there is only one and we 
can speak of the least upper bound. 

It is common practice to refer to the least upper bound of a set by the more 
concise term supremum, abbreviated sup. We shall adopt this convention and write 

b = sup S 

to indicate that b is the supremum of S. If S has a maximum element, then 
max S = sup S. 

The greatest lower bound, or infimum of S, denoted by inf S, is defined in an 
analogous fashion. 

1.11 THE COMPLETENESS AXIOM 

Our final axiom for the real number system involves the notion of supremum. 

Axiom 10. Every nonempty set S of real numbers which is bounded above has a 
supremum; that is, there is a real number b such that b = sup S. 

As a consequence of this axiom it follows that every nonempty set of real 
numbers which is bounded below has an infimum.

1.12 SOME PROPERTIES OF THE SUPREMUM 

This section discusses some fundamental properties of the supremum that will be 
useful in this text. There is a corresponding set of properties of the infimum that 
the reader should formulate for himself. 

The first property shows that a set with a supremum contains numbers arbi- 
trarily close to its supremum. 

Theorem 1.14 (Approximation property). Let S be a nonempty set of real numbers
with a supremum, say b = sup S. Then for every a < b there is some x in S such
that

$$a < x \le b.$$

Proof. First of all, $x \le b$ for all x in S. If we had $x \le a$ for every x in S, then a
would be an upper bound for S smaller than the least upper bound. Therefore
x > a for at least one x in S.

Theorem 1.15 (Additive property). Given nonempty subsets A and B of R, let C
denote the set

$$C = \{x + y : x \in A,\ y \in B\}.$$

If each of A and B has a supremum, then C has a supremum and

$$\sup C = \sup A + \sup B.$$

Proof. Let a = sup A, b = sup B. If $z \in C$ then z = x + y, where $x \in A$,
$y \in B$, so $z = x + y \le a + b$. Hence a + b is an upper bound for C, so C has a
supremum, say c = sup C, and $c \le a + b$. We show next that $a + b \le c$.
Choose any $\varepsilon > 0$. By Theorem 1.14 there is an x in A and a y in B such that

$$a - \varepsilon < x \quad \text{and} \quad b - \varepsilon < y.$$

Adding these inequalities we find

$$a + b - 2\varepsilon < x + y \le c.$$

Thus, $a + b < c + 2\varepsilon$ for every $\varepsilon > 0$ so, by Theorem 1.1, $a + b \le c$.

The proof of the next theorem is left as an exercise for the reader. 

Theorem 1.16 (Comparison property). Given nonempty subsets S and T of R such
that $s \le t$ for every s in S and t in T. If T has a supremum then S has a supremum
and

$$\sup S \le \sup T.$$

1.13 PROPERTIES OF THE INTEGERS DEDUCED FROM THE 
COMPLETENESS AXIOM 

Theorem 1.17. The set $Z^+$ of positive integers 1, 2, 3, . . . is unbounded above.

Proof. If $Z^+$ were bounded above then $Z^+$ would have a supremum, say a =
sup $Z^+$. By Theorem 1.14 we would have a - 1 < n for some n in $Z^+$. Then
n + 1 > a for this n. Since $n + 1 \in Z^+$ this contradicts the fact that a = sup $Z^+$.

Theorem 1.18. For every real x there is a positive integer n such that n > x.

Proof. If this were not true, some x would be an upper bound for $Z^+$, contradicting
Theorem 1.17.


1.14 THE ARCHIMEDEAN PROPERTY OF THE REAL NUMBER SYSTEM 

The next theorem describes the Archimedean property of the real number system.
Geometrically, it tells us that any line segment, no matter how long, can be
covered by a finite number of line segments of a given positive length, no matter
how small.

Theorem 1.19. If x > 0 and if y is an arbitrary real number, there is a positive 
integer n such that nx > y. 

Proof. Apply Theorem 1.18 with x replaced by y/x.
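
For rational inputs the integer promised by Theorem 1.19 can be written down explicitly. The following Python sketch is an illustration under the assumption that x and y are given as exact rationals (or as values that `Fraction` can represent exactly); the name `archimedean_n` is hypothetical.

```python
from fractions import Fraction
from math import floor


def archimedean_n(x, y):
    """Return the smallest positive integer n with n*x > y, assuming x > 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(y/x) + 1 is the smallest integer exceeding y/x; n must also be positive.
    return max(1, floor(y / x) + 1)


print(archimedean_n(Fraction(1, 1000), 7))  # 7001
```

Here segments of length 1/1000 need 7001 copies to exceed a segment of length 7, because 7000 copies give exactly 7.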


1.15 RATIONAL NUMBERS WITH FINITE DECIMAL REPRESENTATION 
A real number of the form

$$r = a_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_n}{10^n},$$

where $a_0$ is a nonnegative integer and $a_1, \ldots, a_n$ are integers satisfying $0 \le a_i \le 9$,
is usually written more briefly as follows:

$$r = a_0.a_1 a_2 \cdots a_n.$$

This is said to be a finite decimal representation of r. For example,

$$\frac{29}{4} = 7 + \frac{2}{10} + \frac{5}{10^2} = 7.25.$$

Real numbers like these are necessarily rational and, in fact, they all have the form
$r = a/10^n$, where a is an integer. However, not all rational numbers can be expressed
with finite decimal representations. For example, if 1/3 could be so expressed,
then we would have $1/3 = a/10^n$ or $3a = 10^n$ for some integer a. But this is impossible
since 3 does not divide any power of 10.
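
The argument for 1/3 generalizes: a rational number in lowest terms equals $a/10^n$ for some n exactly when its denominator has no prime factors other than 2 and 5. This criterion is a standard consequence of unique factorization and is not stated in the text. The Python sketch below applies it; the name `finite_decimal_exponent` is hypothetical.

```python
from fractions import Fraction


def finite_decimal_exponent(r):
    """Return the least n >= 0 with r * 10**n an integer, or None if there is none."""
    q = Fraction(r).denominator  # denominator of r in lowest terms
    twos = fives = 0
    while q % 2 == 0:
        q //= 2
        twos += 1
    while q % 5 == 0:
        q //= 5
        fives += 1
    # r = a / 10**n requires the reduced denominator to divide 10**n = 2**n * 5**n.
    return max(twos, fives) if q == 1 else None


print(finite_decimal_exponent(Fraction(29, 4)))  # 2, matching 7.25
print(finite_decimal_exponent(Fraction(1, 3)))   # None
```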


1.16 FINITE DECIMAL APPROXIMATIONS TO REAL NUMBERS 

This section uses the completeness axiom to show that real numbers can be 
approximated to any desired degree of accuracy by rational numbers with finite 
decimal representations. 

Theorem 1.20. Assume $x \ge 0$. Then for every integer $n \ge 1$ there is a finite
decimal $r_n = a_0.a_1 a_2 \cdots a_n$ such that

$$r_n \le x < r_n + \frac{1}{10^n}.$$

Proof. Let S be the set of all nonnegative integers $\le x$. Then S is nonempty,
since $0 \in S$, and S is bounded above by x. Therefore S has a supremum, say
$a_0 = \sup S$. It is easily verified that $a_0 \in S$, so $a_0$ is a nonnegative integer. We
call $a_0$ the greatest integer in x, and we write $a_0 = [x]$. Clearly, we have

$$a_0 \le x < a_0 + 1.$$

Now let $a_1 = [10x - 10a_0]$, the greatest integer in $10x - 10a_0$. Since
$0 \le 10x - 10a_0 = 10(x - a_0) < 10$, we have $0 \le a_1 \le 9$ and

$$a_1 \le 10x - 10a_0 < a_1 + 1.$$

In other words, $a_1$ is the largest integer satisfying the inequalities

$$a_0 + \frac{a_1}{10} \le x < a_0 + \frac{a_1 + 1}{10}.$$

More generally, having chosen $a_1, \ldots, a_{n-1}$ with $0 \le a_i \le 9$, let $a_n$ be the
largest integer satisfying the inequalities

$$a_0 + \frac{a_1}{10} + \cdots + \frac{a_n}{10^n} \le x < a_0 + \frac{a_1}{10} + \cdots + \frac{a_n + 1}{10^n}. \tag{4}$$

Then $0 \le a_n \le 9$ and we have

$$r_n \le x < r_n + \frac{1}{10^n},$$

where $r_n = a_0.a_1 a_2 \cdots a_n$. This completes the proof. It is easy to verify that x is
actually the supremum of the set of rational numbers $r_1, r_2, \ldots$.
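
The digit-by-digit construction in this proof can be carried out exactly when x is rational. The Python sketch below is an illustration under that assumption; the names `decimal_digits` and `finite_decimal` are hypothetical. Each step multiplies the current remainder by 10 and takes its greatest integer, which is the same as choosing the largest digit allowed by the inequalities in the proof.

```python
from fractions import Fraction
from math import floor


def decimal_digits(x, n):
    """Return [a_0, a_1, ..., a_n] for x >= 0, following the construction in the proof."""
    x = Fraction(x)
    if x < 0:
        raise ValueError("x must be nonnegative")
    digits = [floor(x)]         # a_0 = [x], the greatest integer in x
    remainder = x - digits[0]   # 0 <= remainder < 1
    for _ in range(n):
        remainder *= 10
        a = floor(remainder)    # next digit, an integer from 0 to 9
        digits.append(a)
        remainder -= a
    return digits


def finite_decimal(digits):
    """r_n = a_0 + a_1/10 + ... + a_n/10**n as an exact fraction."""
    return sum(Fraction(a, 10 ** k) for k, a in enumerate(digits))


print(decimal_digits(Fraction(1, 8), 5))  # [0, 1, 2, 5, 0, 0]
```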


1.17 INFINITE DECIMAL REPRESENTATIONS OF REAL NUMBERS 

The integers $a_0, a_1, a_2, \ldots$ obtained in the proof of Theorem 1.20 can be used to
define an infinite decimal representation of x. We write

$$x = a_0.a_1 a_2 \cdots$$

to mean that $a_n$ is the largest integer satisfying (4). For example, if x = 1/8 we find
$a_0 = 0$, $a_1 = 1$, $a_2 = 2$, $a_3 = 5$, and $a_n = 0$ for all $n \ge 4$. Therefore we can
write

$$\tfrac{1}{8} = 0.125000\cdots$$

If we interchange the inequality signs $\le$ and < in (4), we obtain a slightly
different definition of decimal expansions. The finite decimals $r_n$ satisfy
$r_n < x \le r_n + 10^{-n}$ although the digits $a_0, a_1, a_2, \ldots$ need not be the same as those in (4).
For example, if we apply this second definition to x = 1/8 we find the infinite decimal
representation

$$\tfrac{1}{8} = 0.124999\cdots$$

The fact that a real number might have two different decimal representations is 
merely a reflection of the fact that two different sets of real numbers can have the 
same supremum. 
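
The second convention can also be computed exactly for a positive rational x. In the Python sketch below, which is an illustration only with the hypothetical name `decimal_digits_alt`, the quantity $10^k r_k$ is the largest integer strictly less than $10^k x$, and each digit is the difference between successive such integers.

```python
from fractions import Fraction
from math import ceil


def decimal_digits_alt(x, n):
    """Digits a_0, ..., a_n of x > 0 under the convention r_k < x <= r_k + 10**-k."""
    x = Fraction(x)
    if x <= 0:
        raise ValueError("x must be positive")
    digits = []
    previous = 0  # holds 10**(k-1) * r_(k-1) as an integer
    for k in range(n + 1):
        scaled = ceil(x * 10 ** k) - 1  # largest integer m with m < 10**k * x
        digits.append(scaled - 10 * previous)
        previous = scaled
    return digits


print(decimal_digits_alt(Fraction(1, 8), 6))  # [0, 1, 2, 4, 9, 9, 9]
```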


1.18 ABSOLUTE VALUES AND THE TRIANGLE INEQUALITY 

Calculations with inequalities arise quite frequently in analysis. They are of 
particular importance in dealing with the notion of absolute value. If x is any real
number, the absolute value of x, denoted by |x|, is defined as follows:

$$|x| = \begin{cases} x, & \text{if } x \ge 0, \\ -x, & \text{if } x < 0. \end{cases}$$

A fundamental inequality concerning absolute values is given in the following:

Theorem 1.21. If $a \ge 0$, then we have the inequality $|x| \le a$ if, and only if,
$-a \le x \le a$.

Proof. From the definition of |x|, we have the inequality $-|x| \le x \le |x|$, since
x = |x| or x = -|x|. If we assume that $|x| \le a$, then we can write
$-a \le -|x| \le x \le |x| \le a$ and thus half of the theorem is proved. Conversely, let us assume
$-a \le x \le a$. Then if $x \ge 0$, we have $|x| = x \le a$, whereas if x < 0, we have
$|x| = -x \le a$. In either case we have $|x| \le a$ and the theorem is proved.

We can use this theorem to prove the triangle inequality. 

Theorem 1.22. For arbitrary real x and y we have

$$|x + y| \le |x| + |y| \quad \text{(the triangle inequality)}.$$

Proof. We have $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$. Addition gives us
$-(|x| + |y|) \le x + y \le |x| + |y|$, and from Theorem 1.21 we conclude that
$|x + y| \le |x| + |y|$. This proves the theorem.

The triangle inequality is often used in other forms. For example, if we take
x = a - c and y = c - b in Theorem 1.22 we find

$$|a - b| \le |a - c| + |c - b|.$$

Also, from Theorem 1.22 we have $|x| \ge |x + y| - |y|$. Taking x = a + b,
y = -b, we obtain

$$|a + b| \ge |a| - |b|.$$

Interchanging a and b we also find $|a + b| \ge |b| - |a| = -(|a| - |b|)$, and
hence

$$|a + b| \ge \bigl|\, |a| - |b| \,\bigr|.$$

By induction we can also prove the generalizations

$$|x_1 + x_2 + \cdots + x_n| \le |x_1| + |x_2| + \cdots + |x_n|$$

and

$$|x_1 + x_2 + \cdots + x_n| \ge |x_1| - |x_2| - \cdots - |x_n|.$$

1.19 THE CAUCHY-SCHWARZ INEQUALITY 

We shall now derive another inequality which is often used in analysis.

Theorem 1.23 (Cauchy-Schwarz inequality). If $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are
arbitrary real numbers, we have

$$\left( \sum_{k=1}^{n} a_k b_k \right)^2 \le \left( \sum_{k=1}^{n} a_k^2 \right) \left( \sum_{k=1}^{n} b_k^2 \right).$$

Moreover, if some $a_i \ne 0$, equality holds if and only if there is a real x such that
$a_k x + b_k = 0$ for each $k = 1, 2, \ldots, n$.

Proof. A sum of squares can never be negative. Hence we have

$$\sum_{k=1}^{n} (a_k x + b_k)^2 \ge 0$$

for every real x, with equality if and only if each term is zero. This inequality can
be written in the form

$$A x^2 + 2Bx + C \ge 0,$$

where

$$A = \sum_{k=1}^{n} a_k^2, \qquad B = \sum_{k=1}^{n} a_k b_k, \qquad C = \sum_{k=1}^{n} b_k^2.$$

If $A > 0$, put $x = -B/A$ to obtain $B^2 - AC \le 0$, which is the desired inequality.
If $A = 0$, the proof is trivial.


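note. In vector notation the Cauchy-Schwarz inequality takes the form

$$(\mathbf{a} \cdot \mathbf{b})^2 \le \|\mathbf{a}\|^2 \|\mathbf{b}\|^2,$$

where $\mathbf{a} = (a_1, \ldots, a_n)$, $\mathbf{b} = (b_1, \ldots, b_n)$ are two n-dimensional vectors,

$$\mathbf{a} \cdot \mathbf{b} = \sum_{k=1}^{n} a_k b_k$$

is their dot product, and $\|\mathbf{a}\| = (\mathbf{a} \cdot \mathbf{a})^{1/2}$ is the length of $\mathbf{a}$.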
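
As a small illustration of the Cauchy-Schwarz inequality and its equality case, the sketch below computes the quantity $AC - B^2$ from the proof for two integer vectors. The function names `dot` and `cauchy_schwarz_gap` are assumptions made for this example only.

```python
def dot(a, b):
    """Dot product of two equal-length sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))

def cauchy_schwarz_gap(a, b):
    """Return ||a||^2 ||b||^2 - (a.b)^2, that is, AC - B^2 in the proof's notation."""
    return dot(a, a) * dot(b, b) - dot(a, b) ** 2

a = [1, 2, 3]
b = [4, -5, 6]
print(cauchy_schwarz_gap(a, b))   # 14*77 - 12**2 = 934, strictly positive

# Equality case: c_k = -2 * a_k, so a_k * x + c_k = 0 for every k with x = 2.
c = [-2 * t for t in a]
print(cauchy_schwarz_gap(a, c))   # 0
```

The second vector is chosen so that the condition $a_k x + b_k = 0$ of Theorem 1.23 holds with $x = 2$, which is why the gap vanishes.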


1.20 PLUS AND MINUS INFINITY AND THE EXTENDED REAL NUMBER 
SYSTEM R* 

Next we extend the real number system by adjoining two “ideal points” denoted
by the symbols $+\infty$ and $-\infty$ (“plus infinity” and “minus infinity”).

Definition 1.24. By the extended real number system R* we shall mean the set of
real numbers R together with two symbols $+\infty$ and $-\infty$ which satisfy the following
properties:

a) If $x \in R$, then we have

$$x + (+\infty) = +\infty, \qquad x + (-\infty) = -\infty,$$
$$x - (+\infty) = -\infty, \qquad x - (-\infty) = +\infty,$$
$$x/(+\infty) = x/(-\infty) = 0.$$

b) If $x > 0$, then we have

$$x(+\infty) = +\infty, \qquad x(-\infty) = -\infty.$$

c) If $x < 0$, then we have

$$x(+\infty) = -\infty, \qquad x(-\infty) = +\infty.$$

d) $(+\infty) + (+\infty) = (+\infty)(+\infty) = (-\infty)(-\infty) = +\infty$,

$(-\infty) + (-\infty) = (+\infty)(-\infty) = -\infty$.

e) If $x \in R$, then we have $-\infty < x < +\infty$.
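
The arithmetic rules of Definition 1.24 happen to match the behavior of IEEE floating-point infinities, so they can be explored in Python with `math.inf`. This is only an analogy under that assumption: floating-point numbers are not the real numbers, and the helper `check_rules` is a hypothetical name introduced here.

```python
import math

INF = math.inf   # stands in for +infinity; -INF stands in for -infinity

def check_rules(x):
    """Return True if the finite float x obeys rules (a)-(c) and (e) of Definition 1.24."""
    rules = [
        x + INF == INF, x + (-INF) == -INF,
        x - INF == -INF, x - (-INF) == INF,
        x / INF == 0, x / (-INF) == 0,
        -INF < x < INF,
    ]
    if x > 0:
        rules += [x * INF == INF, x * (-INF) == -INF]
    if x < 0:
        rules += [x * INF == -INF, x * (-INF) == INF]
    return all(rules)

print(all(check_rules(x) for x in [-2.5, -1.0, 0.0, 3.0, 1e300]))
# Rule (d):
print(INF + INF == INF, INF * INF == INF, (-INF) * (-INF) == INF)
print((-INF) + (-INF) == -INF, INF * (-INF) == -INF)
# Combinations the definition leaves undefined come out as NaN in floating point:
print(math.isnan(INF - INF), math.isnan(0.0 * INF))
```

Note that the definition assigns no value to $(+\infty) + (-\infty)$ or to $0 \cdot (\pm\infty)$; floating-point arithmetic signals these cases with NaN rather than choosing a value.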

notation. We denote R by $(-\infty, +\infty)$ and R* by $[-\infty, +\infty]$. The points in R
are called “finite” to distinguish them from the “infinite” points $+\infty$ and $-\infty$.

The principal reason for introducing the symbols $+\infty$ and $-\infty$ is one of
convenience. For example, if we define $+\infty$ to be the sup of a set of real numbers
which is not bounded above, then every nonempty subset of R has a supremum
in R*. The sup is finite if the set is bounded above and infinite if it is not bounded
above. Similarly, we define $-\infty$ to be the inf of any set of real numbers which is
not bounded below. Then every nonempty subset of R has an inf in R*.

For some of the later work concerned with limits, it is also convenient to
introduce the following terminology.

Definition 1.25. Every open interval $(a, +\infty)$ is called a neighborhood of $+\infty$ or
a ball with center $+\infty$. Every open interval $(-\infty, a)$ is called a neighborhood of
$-\infty$ or a ball with center $-\infty$.


1.21 COMPLEX NUMBERS 

It follows from the axioms governing the relation < that the square of a real
number is never negative. Thus, for example, the elementary quadratic equation
$x^2 = -1$ has no solution among the real numbers. New types of numbers, called
complex numbers, have been introduced to provide solutions to such equations. It
turns out that the introduction of complex numbers provides, at the same time,
solutions to general algebraic equations of the form

$$a_0 + a_1 x + \cdots + a_n x^n = 0,$$

where the coefficients $a_0, a_1, \ldots, a_n$ are arbitrary real numbers. (This fact is
known as the Fundamental Theorem of Algebra.)

We shall now define complex numbers and discuss them in further detail.

Definition 1.26. By a complex number we shall mean an ordered pair of real numbers
which we denote by $(x_1, x_2)$. The first member, $x_1$, is called the real part of the
complex number; the second member, $x_2$, is called the imaginary part. Two complex
numbers $x = (x_1, x_2)$ and $y = (y_1, y_2)$ are called equal, and we write $x = y$, if,
and only if, $x_1 = y_1$ and $x_2 = y_2$. We define the sum $x + y$ and the product $xy$ by
the equations

$$x + y = (x_1 + y_1, x_2 + y_2), \qquad xy = (x_1y_1 - x_2y_2, x_1y_2 + x_2y_1).$$

note. The set of all complex numbers will be denoted by C. 
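
Definition 1.26 translates directly into code on ordered pairs. The sketch below is an illustration only; the names `c_add` and `c_mul` are hypothetical, and Python tuples are assumed as the representation of a pair $(x_1, x_2)$.

```python
def c_add(x, y):
    """Sum of two ordered pairs, as in Definition 1.26."""
    return (x[0] + y[0], x[1] + y[1])

def c_mul(x, y):
    """Product of two ordered pairs, as in Definition 1.26."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

x, y = (1, 2), (3, -4)
print(c_add(x, y))   # (4, -2)
print(c_mul(x, y))   # (11, 2)

# Cross-check against Python's built-in complex type.
print(complex(*c_mul(x, y)) == complex(*x) * complex(*y))   # True
```

The built-in `complex` type uses the same product rule, which is why the cross-check agrees.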

Theorem 1.27. The operations of addition and multiplication just defined satisfy
the commutative, associative, and distributive laws.

Proof. We prove only the distributive law; proofs of the others are simpler. If
$x = (x_1, x_2)$, $y = (y_1, y_2)$, and $z = (z_1, z_2)$, then we have

$$x(y + z) = (x_1, x_2)(y_1 + z_1, y_2 + z_2)$$
$$= (x_1y_1 + x_1z_1 - x_2y_2 - x_2z_2, \; x_1y_2 + x_1z_2 + x_2y_1 + x_2z_1)$$
$$= (x_1y_1 - x_2y_2, \; x_1y_2 + x_2y_1) + (x_1z_1 - x_2z_2, \; x_1z_2 + x_2z_1)$$
$$= xy + xz.$$

Theorem 1.28.

$$(x_1, x_2) + (0, 0) = (x_1, x_2), \qquad (x_1, x_2)(0, 0) = (0, 0),$$
$$(x_1, x_2)(1, 0) = (x_1, x_2), \qquad (x_1, x_2) + (-x_1, -x_2) = (0, 0).$$

Proof. The proofs here are immediate from the definition, as are the proofs of
Theorems 1.29, 1.30, 1.32, and 1.33.

Theorem 1.29. Given two complex numbers $x = (x_1, x_2)$ and $y = (y_1, y_2)$, there
exists a complex number z such that $x + z = y$. In fact, $z = (y_1 - x_1, y_2 - x_2)$.
This z is denoted by $y - x$. The complex number $(-x_1, -x_2)$ is denoted by $-x$.

Theorem 1.30. For any two complex numbers x and y, we have

$$(-x)y = x(-y) = -(xy) = (-1, 0)(xy).$$

Definition 1.31. If $x = (x_1, x_2) \ne (0, 0)$ and y are complex numbers, we define

$$x^{-1} = \left[ \frac{x_1}{x_1^2 + x_2^2}, \; \frac{-x_2}{x_1^2 + x_2^2} \right], \quad \text{and} \quad y/x = yx^{-1}.$$

Theorem 1.32. If x and y are complex numbers with $x \ne (0, 0)$, there exists a
complex number z such that $xz = y$, namely, $z = yx^{-1}$.
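
The inverse of Definition 1.31 and the quotient of Theorem 1.32 can be checked with exact rational arithmetic. The sketch below assumes pairs with integer (or `Fraction`) components; `c_mul`, `c_inv`, and `c_div` are hypothetical names chosen for this example.

```python
from fractions import Fraction

def c_mul(x, y):
    """Product of two ordered pairs, as in Definition 1.26."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def c_inv(x):
    """Inverse of a nonzero pair x = (x1, x2), as in Definition 1.31."""
    d = x[0] * x[0] + x[1] * x[1]
    if d == 0:
        raise ZeroDivisionError("(0, 0) has no inverse")
    return (Fraction(x[0], d), Fraction(-x[1], d))

def c_div(y, x):
    """Quotient y/x = y * x^{-1}."""
    return c_mul(y, c_inv(x))

x = (3, -4)
print(c_inv(x))              # components 3/25 and 4/25
print(c_mul(x, c_inv(x)))    # equals (1, 0)
z = c_div((1, 2), x)
print(z)                     # components -1/5 and 2/5
print(c_mul(x, z) == (1, 2)) # True: z solves xz = y
```

Using `Fraction` avoids rounding, so the identity $x x^{-1} = (1, 0)$ holds exactly rather than approximately.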

Of special interest are operations with complex numbers whose imaginary 
part is 0. 

Theorem 1.33.

$$(x_1, 0) + (y_1, 0) = (x_1 + y_1, 0),$$
$$(x_1, 0)(y_1, 0) = (x_1y_1, 0),$$
$$(x_1, 0)/(y_1, 0) = (x_1/y_1, 0), \quad \text{if } y_1 \ne 0.$$

note. It is evident from Theorem 1.33 that we can perform arithmetic operations
on complex numbers with zero imaginary part by performing the usual real-number
operations on the real parts alone. Hence the complex numbers of the form
$(x, 0)$ have the same arithmetic properties as the real numbers. For this reason it is
convenient to think of the real number system as being a special case of the complex
number system, and we agree to identify the complex number $(x, 0)$ and the real
number x. Therefore, we write $x = (x, 0)$. In particular, $0 = (0, 0)$ and $1 = (1, 0)$.

1.22 GEOMETRIC REPRESENTATION OF COMPLEX NUMBERS 

Just as real numbers are represented geometrically by points on a line, so complex
numbers are represented by points in a plane. The complex number $x = (x_1, x_2)$
can be thought of as the “point” with coordinates $(x_1, x_2)$. When this is done, the
definition of addition amounts to addition by the parallelogram law. (See Fig. 1.2.)



Figure 1.2 


The idea of expressing complex numbers geometrically as points on a plane 
was formulated by Gauss in his dissertation in 1799 and, independently, by Argand 
in 1806. Gauss later coined the somewhat unfortunate phrase “complex number.” 
Other geometric interpretations of complex numbers are possible. Instead of 
using points on a plane, we can use points on other surfaces. Riemann found the 
sphere particularly convenient for this purpose. Points of the sphere are projected 
from the North Pole onto the tangent plane at the South Pole and thus there 
corresponds to each point of the plane a definite point of the sphere. With the 
exception of the North Pole itself, each point of the sphere corresponds to exactly 
one point of the plane. This correspondence is called a stereographic projection. 
(See Fig. 1.3.) 



Figure 1.3

1.23 THE IMAGINARY UNIT

It is often convenient to think of the complex number $(x_1, x_2)$ as a two-dimensional
vector with components $x_1$ and $x_2$. Adding two complex numbers by means of
Definition 1.26 is then the same as adding two vectors component by component.
The complex number $1 = (1, 0)$ plays the same role as a unit vector in the horizontal
direction. The analog of a unit vector in the vertical direction will now be
introduced.

Definition 1.34. The complex number $(0, 1)$ is denoted by i and is called the imaginary
unit.

Theorem 1.35. Every complex number $x = (x_1, x_2)$ can be represented in the form
$x = x_1 + ix_2$.

Proof. $x_1 = (x_1, 0)$, $ix_2 = (0, 1)(x_2, 0) = (0, x_2)$,

$$x_1 + ix_2 = (x_1, 0) + (0, x_2) = (x_1, x_2).$$

The next theorem tells us that the complex number i provides us with a solution
to the equation $x^2 = -1$.

Theorem 1.36. $i^2 = -1$.

Proof. $i^2 = (0, 1)(0, 1) = (-1, 0) = -1$.

1.24 ABSOLUTE VALUE OF A COMPLEX NUMBER 

We now extend the concept of absolute value to the complex number system. 

Definition 1.37. If $x = (x_1, x_2)$, we define the modulus, or absolute value, of x to
be the nonnegative real number $|x|$ given by

$$|x| = \sqrt{x_1^2 + x_2^2}.$$

Theorem 1.38.

i) $|(0, 0)| = 0$, and $|x| > 0$ if $x \ne 0$.
ii) $|xy| = |x|\,|y|$.
iii) $|x/y| = |x|/|y|$, if $y \ne 0$.
iv) $|(x_1, 0)| = |x_1|$.

Proof. Statements (i) and (iv) are immediate. To prove (ii), we write $x = x_1 + ix_2$,
$y = y_1 + iy_2$, so that $xy = x_1y_1 - x_2y_2 + i(x_1y_2 + x_2y_1)$. Statement (ii)
follows from the relation

$$|xy|^2 = x_1^2y_1^2 + x_2^2y_2^2 + x_1^2y_2^2 + x_2^2y_1^2 = (x_1^2 + x_2^2)(y_1^2 + y_2^2) = |x|^2|y|^2.$$

Equation (iii) can be derived from (ii) by writing it in the form $|x| = |y|\,|x/y|$.

Geometrically, $|x|$ represents the length of the segment joining the origin to
the point x. More generally, $|x - y|$ is the distance between the points x and y.
Using this geometric interpretation, the following theorem states that one side of
a triangle is less than the sum of the other two sides.

Theorem 1.39. If x and y are complex numbers, then we have

$$|x + y| \le |x| + |y| \quad \text{(triangle inequality)}.$$

The proof is left as an exercise for the reader. 
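
The modulus properties can be illustrated on ordered pairs. The following sketch is a numerical check rather than a proof, and it assumes floating-point components; `modulus` and `c_mul` are hypothetical helper names.

```python
import math

def c_mul(x, y):
    """Product of two ordered pairs, as in Definition 1.26."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def modulus(x):
    """|x| = sqrt(x1^2 + x2^2) for an ordered pair x = (x1, x2)."""
    return math.hypot(x[0], x[1])

x, y = (3.0, 4.0), (5.0, 12.0)
# Property (ii) of Theorem 1.38: |xy| = |x| |y|.
print(modulus(x), modulus(y), modulus(c_mul(x, y)))
# Theorem 1.39: |x + y| <= |x| + |y|.
s = (x[0] + y[0], x[1] + y[1])
print(modulus(s) <= modulus(x) + modulus(y))
```

Here $|x| = 5$ and $|y| = 13$, and the product $(-33, 56)$ has modulus $65$, up to floating-point rounding.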

1.25 IMPOSSIBILITY OF ORDERING THE COMPLEX NUMBERS 

As yet we have not defined a relation of the form $x < y$ if x and y are arbitrary
complex numbers, for the reason that it is impossible to give a definition of < for
complex numbers which will have all the properties in Axioms 6 through 8. To
illustrate, suppose we were able to define an order relation < satisfying Axioms
6, 7, and 8. Then, since $i \ne 0$, we must have either $i > 0$ or $i < 0$, by Axiom 6.
Let us assume $i > 0$. Then, taking $x = y = i$ in Axiom 8, we get $i^2 > 0$, or
$-1 > 0$. Adding 1 to both sides (Axiom 7), we get $0 > 1$. On the other hand,
applying Axiom 8 to $-1 > 0$ we find $1 > 0$. Thus we have both $0 > 1$ and
$1 > 0$, which, by Axiom 6, is impossible. Hence the assumption $i > 0$ leads us
to a contradiction. [Why was the inequality $-1 > 0$ not already a contradiction?]
A similar argument shows that we cannot have $i < 0$. Hence the complex numbers
cannot be ordered in such a way that Axioms 6, 7, and 8 will be satisfied.


1.26 COMPLEX EXPONENTIALS 

The exponential $e^x$ (x real) was mentioned earlier. We now wish to define $e^z$ when
z is a complex number in such a way that the principal properties of the real
exponential function will be preserved. The main properties of $e^x$ for x real are
the law of exponents, $e^{x_1} e^{x_2} = e^{x_1 + x_2}$, and the equation $e^0 = 1$. We shall give a
definition of $e^z$ for complex z which preserves these properties and reduces to the
ordinary exponential when z is real.

If we write $z = x + iy$ (x, y real), then for the law of exponents to hold we
want $e^{x+iy} = e^x e^{iy}$. It remains, therefore, to define what we shall mean by $e^{iy}$.

Definition 1.40. If $z = x + iy$, we define $e^z = e^{x+iy}$ to be the complex number

$$e^z = e^x(\cos y + i \sin y).$$

This definition* agrees with the real exponential function when z is real (that
is, $y = 0$). We prove next that the law of exponents still holds.


* Several arguments can be given to motivate the equation $e^{iy} = \cos y + i \sin y$. For
example, let us write $e^{iy} = f(y) + ig(y)$ and try to determine the real-valued functions f
and g so that the usual rules of operating with real exponentials will also apply to complex
exponentials. Formal differentiation yields $e^{iy} = g'(y) - if'(y)$, if we assume that
$(e^{iy})' = ie^{iy}$. Comparing the two expressions for $e^{iy}$, we see that f and g must satisfy the
equations $f(y) = g'(y)$, $f'(y) = -g(y)$. Elimination of g yields $f(y) = -f''(y)$. Since
we want $e^0 = 1$, we must have $f(0) = 1$ and $f'(0) = 0$. It follows that $f(y) = \cos y$ and
$g(y) = -f'(y) = \sin y$. Of course, this argument proves nothing, but it strongly suggests
that the definition $e^{iy} = \cos y + i \sin y$ is reasonable.

Theorem 1.41. If $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ are two complex numbers,
then we have

$$e^{z_1} e^{z_2} = e^{z_1 + z_2}.$$

Proof.

$$e^{z_1} = e^{x_1}(\cos y_1 + i \sin y_1), \qquad e^{z_2} = e^{x_2}(\cos y_2 + i \sin y_2),$$

$$e^{z_1} e^{z_2} = e^{x_1} e^{x_2} [\cos y_1 \cos y_2 - \sin y_1 \sin y_2 + i(\cos y_1 \sin y_2 + \sin y_1 \cos y_2)].$$

Now $e^{x_1} e^{x_2} = e^{x_1 + x_2}$, since $x_1$ and $x_2$ are both real. Also,

$$\cos y_1 \cos y_2 - \sin y_1 \sin y_2 = \cos (y_1 + y_2)$$

and

$$\cos y_1 \sin y_2 + \sin y_1 \cos y_2 = \sin (y_1 + y_2),$$

and hence

$$e^{z_1} e^{z_2} = e^{x_1 + x_2} [\cos (y_1 + y_2) + i \sin (y_1 + y_2)] = e^{z_1 + z_2}.$$
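
Definition 1.40 and the law of exponents can be tried out numerically. The sketch below builds $e^z$ from the real functions `math.exp`, `math.cos`, and `math.sin`, then compares it with the library routine `cmath.exp`; the name `c_exp` is a hypothetical helper, and agreement is only up to floating-point rounding.

```python
import cmath
import math

def c_exp(z):
    """e^z = e^x (cos y + i sin y) for z = x + iy, following Definition 1.40."""
    x, y = z.real, z.imag
    return math.exp(x) * complex(math.cos(y), math.sin(y))

z1, z2 = complex(0.5, 2.0), complex(-1.25, 0.75)
print(cmath.isclose(c_exp(z1) * c_exp(z2), c_exp(z1 + z2)))   # law of exponents
print(cmath.isclose(c_exp(z1), cmath.exp(z1)))                # agrees with the library
print(c_exp(complex(1.0, 0.0)).imag == 0.0)                   # real input stays real
```

The last line reflects the remark that the definition reduces to the ordinary exponential when $y = 0$.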


1.27 FURTHER PROPERTIES OF COMPLEX EXPONENTIALS 
In the following theorems, $z$, $z_1$, $z_2$ denote complex numbers.

Theorem 1.42. $e^z$ is never zero.

Proof. $e^z e^{-z} = e^0 = 1$. Hence $e^z$ cannot be zero.

Theorem 1.43. If x is real, then $|e^{ix}| = 1$.

Proof. $|e^{ix}|^2 = \cos^2 x + \sin^2 x = 1$, and $|e^{ix}| > 0$.

Theorem 1.44. $e^z = 1$ if, and only if, z is an integral multiple of $2\pi i$.

Proof. If $z = 2\pi in$, where n is an integer, then

$$e^z = \cos (2\pi n) + i \sin (2\pi n) = 1.$$

Conversely, suppose that $e^z = 1$. This means that $e^x \cos y = 1$ and $e^x \sin y = 0$.
Since $e^x \ne 0$, we must have $\sin y = 0$, $y = k\pi$, where k is an integer. But
$\cos (k\pi) = (-1)^k$. Hence $e^x = (-1)^k$, since $e^x \cos (k\pi) = 1$. Since $e^x > 0$,
k must be even. Therefore $e^x = 1$ and hence $x = 0$. This proves the theorem.

Theorem 1.45. $e^{z_1} = e^{z_2}$ if, and only if, $z_1 - z_2 = 2\pi in$ (where n is an integer).

Proof. $e^{z_1} = e^{z_2}$ if, and only if, $e^{z_1 - z_2} = 1$.
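
The periodicity expressed by Theorems 1.44 and 1.45, and the unit modulus of Theorem 1.43, can be observed with `cmath.exp`. The sample values below are arbitrary choices for this sketch, and the comparisons hold only up to floating-point rounding.

```python
import cmath
import math

z = complex(0.3, -1.2)

# Shifting z by integer multiples of 2*pi*i should not change e^z (Theorem 1.45).
shifted = [cmath.exp(z + 2j * math.pi * n) for n in range(-3, 4)]
print(all(cmath.isclose(w, cmath.exp(z)) for w in shifted))

# |e^{ix}| = 1 for real x (Theorem 1.43).
moduli = [abs(cmath.exp(1j * x)) for x in (-2.0, 0.0, 0.7, 10.0)]
print(all(math.isclose(m, 1.0) for m in moduli))

# e^z is never zero (Theorem 1.42); its modulus is e^{Re z} > 0.
print(math.isclose(abs(cmath.exp(z)), math.exp(z.real)))
```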


1.28 THE ARGUMENT OF A COMPLEX NUMBER 

If the point $z = (x, y) = x + iy$ is represented by polar coordinates r and $\theta$, we
can write $x = r \cos \theta$ and $y = r \sin \theta$, so that $z = r \cos \theta + ir \sin \theta = re^{i\theta}$.
The two numbers r and $\theta$ uniquely determine z. Conversely, the positive number
r is uniquely determined by z; in fact, $r = |z|$. However, z determines the angle $\theta$
only up to multiples of $2\pi$. There are infinitely many values of $\theta$ which satisfy the
equations $x = |z| \cos \theta$, $y = |z| \sin \theta$ but, of course, any two of them differ by
some multiple of $2\pi$. Each such $\theta$ is called an argument of z but one of these values
is singled out and is called the principal argument of z.

Definition 1.46. Let $z = x + iy$ be a nonzero complex number. The unique real
number $\theta$ which satisfies the conditions

$$x = |z| \cos \theta, \qquad y = |z| \sin \theta, \qquad -\pi < \theta \le +\pi$$

is called the principal argument of z, denoted by $\theta = \arg (z)$.

The above discussion immediately yields the following theorem:

Theorem 1.47. Every complex number $z \ne 0$ can be represented in the form
$z = re^{i\theta}$, where $r = |z|$ and $\theta = \arg (z) + 2\pi n$, n being any integer.
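
One common way to compute the principal argument is the two-argument arctangent. The sketch below assumes Python's `math.atan2`, whose range is $[-\pi, \pi]$; the value $-\pi$ can occur for inputs with a negative zero imaginary part, so it is mapped to $+\pi$ to match the convention $-\pi < \theta \le +\pi$ of Definition 1.46. The names `principal_arg` and `from_polar` are hypothetical.

```python
import cmath
import math

def principal_arg(z):
    """Principal argument in (-pi, pi] of a nonzero complex number z."""
    z = complex(z)
    if z == 0:
        raise ValueError("arg(0) is undefined")
    theta = math.atan2(z.imag, z.real)
    # atan2 returns -pi for inputs such as complex(-1.0, -0.0); map that to +pi.
    return math.pi if theta == -math.pi else theta

def from_polar(r, theta):
    """Rebuild z = r e^{i theta} from polar coordinates."""
    return r * cmath.exp(1j * theta)

z = complex(-3.0, 4.0)
theta = principal_arg(z)
# Every theta + 2*pi*n represents the same point (Theorem 1.47).
print(all(cmath.isclose(from_polar(abs(z), theta + 2 * math.pi * n), z)
          for n in (-2, -1, 0, 1, 2)))
print(principal_arg(-1) == math.pi)
print(math.isclose(principal_arg(-1j), -math.pi / 2))
```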

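note. This method of representing complex numbers is particularly useful in
connection with multiplication and division, since we have

$$(r_1 e^{i\theta_1})(r_2 e^{i\theta_2}) = r_1 r_2 e^{i(\theta_1 + \theta_2)} \qquad \text{and} \qquad \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}.$$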


Theorem 1.48. If $z_1 z_2 \ne 0$, then $\arg (z_1 z_2) = \arg (z_1) + \arg (z_2) + 2\pi n(z_1, z_2)$,
where

$$n(z_1, z_2) = \begin{cases} 0, & \text{if } -\pi < \arg (z_1) + \arg (z_2) \le +\pi, \\ +1, & \text{if } -2\pi < \arg (z_1) + \arg (z_2) \le -\pi, \\ -1, & \text{if } \pi < \arg (z_1) + \arg (z_2) \le 2\pi. \end{cases}$$

Proof. Write $z_1 = |z_1| e^{i\theta_1}$, $z_2 = |z_2| e^{i\theta_2}$, where $\theta_1 = \arg (z_1)$ and $\theta_2 = \arg (z_2)$.
Then $z_1 z_2 = |z_1 z_2| e^{i(\theta_1 + \theta_2)}$. Since $-\pi < \theta_1 \le +\pi$ and $-\pi < \theta_2 \le +\pi$, we
have $-2\pi < \theta_1 + \theta_2 \le 2\pi$. Hence there is an integer n such that
$-\pi < \theta_1 + \theta_2 + 2\pi n \le \pi$. This n is the same as the integer $n(z_1, z_2)$ given in the theorem,
and for this n we have $\arg (z_1 z_2) = \theta_1 + \theta_2 + 2\pi n$. This proves the theorem.
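
The integer $n(z_1, z_2)$ of Theorem 1.48 is easy to compute once the principal arguments are known. The sketch below uses `cmath.phase` as the principal argument, which is an assumption that holds away from inputs with a negative zero imaginary part; `n_correction` is a hypothetical name.

```python
import cmath
import math

def n_correction(z1, z2):
    """Integer n(z1, z2) of Theorem 1.48, for nonzero z1 and z2."""
    s = cmath.phase(z1) + cmath.phase(z2)
    if s > math.pi:
        return -1
    if s <= -math.pi:
        return 1
    return 0

cases = [(complex(-1, 1), complex(-1, 1)),    # arguments sum to 3*pi/2
         (complex(-1, -1), complex(-1, -1)),  # arguments sum to -3*pi/2
         (complex(1, 1), complex(0, 1))]      # arguments sum to 3*pi/4
for z1, z2 in cases:
    n = n_correction(z1, z2)
    lhs = cmath.phase(z1 * z2)
    rhs = cmath.phase(z1) + cmath.phase(z2) + 2 * math.pi * n
    print(n, math.isclose(lhs, rhs, abs_tol=1e-12))
```

The three cases exercise the three branches of the definition of $n(z_1, z_2)$, giving $-1$, $+1$, and $0$ respectively.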


1.29 INTEGRAL POWERS AND ROOTS OF COMPLEX NUMBERS 

Definition 1.49. Given a complex number z and an integer n, we define the nth power
of z as follows:

$$z^0 = 1, \qquad z^{n+1} = z^n z, \quad \text{if } n \ge 0,$$
$$z^{-n} = (z^{-1})^n, \quad \text{if } z \ne 0 \text{ and } n > 0.$$

Theorem 1.50, which states that the usual laws of exponents hold, can be proved
by mathematical induction. The proof is left as an exercise.

Theorem 1.50. Given two integers m and n, we have, for $z \ne 0$,

$$z^n z^m = z^{n+m} \qquad \text{and} \qquad (z_1 z_2)^n = z_1^n z_2^n.$$

Theorem 1.51. If $z \ne 0$, and if n is a positive integer, then there are exactly n
distinct complex numbers $z_0, z_1, \ldots, z_{n-1}$ (called the nth roots of z), such that

$$z_k^n = z, \quad \text{for each } k = 0, 1, 2, \ldots, n - 1.$$

Furthermore, these roots are given by the formulas

$$z_k = Re^{i\phi_k}, \quad \text{where } R = |z|^{1/n},$$

and

$$\phi_k = \frac{\arg (z)}{n} + \frac{2\pi k}{n} \qquad (k = 0, 1, 2, \ldots, n - 1).$$

note. The n nth roots of z are equally spaced on the circle of radius $R = |z|^{1/n}$,
center at the origin.

Proof. The n complex numbers $Re^{i\phi_k}$, $0 \le k \le n - 1$, are distinct and each is
an nth root of z, since

$$(Re^{i\phi_k})^n = R^n e^{in\phi_k} = |z| e^{i[\arg (z) + 2\pi k]} = z.$$

We must now show that there are no other nth roots of z. Suppose $w = Ae^{i\alpha}$ is
a complex number such that $w^n = z$. Then $|w|^n = |z|$, and hence $A^n = |z|$,
$A = |z|^{1/n}$. Therefore, $w^n = z$ can be written $e^{in\alpha} = e^{i[\arg (z)]}$, which implies

$$n\alpha - \arg (z) = 2\pi k \quad \text{for some integer } k.$$

Hence $\alpha = [\arg (z) + 2\pi k]/n$. But when k runs through all integral values, w
takes only the distinct values $z_0, \ldots, z_{n-1}$. (See Fig. 1.4.)
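
The formulas of Theorem 1.51 can be turned into a short routine. The sketch below takes `cmath.phase` as the principal argument and works in floating point, so the roots are approximations; `nth_roots` is a hypothetical name.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n distinct nth roots z_k = R e^{i phi_k} of a nonzero z."""
    if z == 0 or n < 1:
        raise ValueError("need z != 0 and a positive integer n")
    R = abs(z) ** (1.0 / n)
    theta = cmath.phase(z)
    return [R * cmath.exp(1j * (theta + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)
for w in roots:
    # Each root has modulus R = 2 and satisfies w^3 = -8 up to rounding.
    print(w, cmath.isclose(w ** 3, -8))
```

For $z = -8$ and $n = 3$ the arguments are $\pi/3$, $\pi$, and $5\pi/3$, so the root with $k = 1$ is the real number $-2$, and the three roots are spaced $2\pi/3$ apart on the circle of radius 2.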



1.30 COMPLEX LOGARITHMS 

By Theorem 1.42, $e^z$ is never zero. It is natural to ask if there are other values
that $e^z$ cannot assume. The next theorem shows that zero is the only exceptional
value.

Theorem 1.52. If z is a complex number $\ne 0$, then there exist complex numbers w
such that $e^w = z$. One such w is the complex number

$$\log |z| + i \arg (z),$$

and any other such w must have the form

$$\log |z| + i \arg (z) + 2n\pi i,$$

where n is an integer.

Proof. Since $e^{\log |z| + i \arg (z)} = e^{\log |z|} e^{i \arg (z)} = |z| e^{i \arg (z)} = z$, we see that
$w = \log |z| + i \arg (z)$ is a solution of the equation $e^w = z$. But if $w_1$ is any other
solution, then $e^w = e^{w_1}$ and hence $w - w_1 = 2n\pi i$.

Definition 1.53. Let $z \ne 0$ be a given complex number. If w is a complex number
such that $e^w = z$, then w is called a logarithm of z. The particular value of w given
by

$$w = \log |z| + i \arg (z)$$

is called the principal logarithm of z, and for this w we write

$$w = \operatorname{Log} z.$$

Examples

1. Since $|i| = 1$ and $\arg (i) = \pi/2$, $\operatorname{Log} (i) = i\pi/2$.

2. Since $|-i| = 1$ and $\arg (-i) = -\pi/2$, $\operatorname{Log} (-i) = -i\pi/2$.

3. Since $|-1| = 1$ and $\arg (-1) = \pi$, $\operatorname{Log} (-1) = \pi i$.

4. If $x > 0$, $\operatorname{Log} (x) = \log x$, since $|x| = x$ and $\arg (x) = 0$.

5. Since $|1 + i| = \sqrt{2}$ and $\arg (1 + i) = \pi/4$, $\operatorname{Log} (1 + i) = \log \sqrt{2} + i\pi/4$.

Theorem 1.54. If $z_1 z_2 \ne 0$, then

$$\operatorname{Log} (z_1 z_2) = \operatorname{Log} z_1 + \operatorname{Log} z_2 + 2\pi i\, n(z_1, z_2),$$

where $n(z_1, z_2)$ is the integer defined in Theorem 1.48.

Proof.

$$\operatorname{Log} (z_1 z_2) = \log |z_1 z_2| + i \arg (z_1 z_2)$$
$$= \log |z_1| + \log |z_2| + i[\arg (z_1) + \arg (z_2) + 2\pi n(z_1, z_2)].$$
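
The principal logarithm and the correction term in Theorem 1.54 can be examined numerically. In this sketch `cmath.phase` is assumed to give the principal argument, and the integer $n$ is recovered by rounding rather than from the case analysis of Theorem 1.48; `principal_log` and `log_correction` are hypothetical names.

```python
import cmath
import math

def principal_log(z):
    """Log z = log|z| + i arg(z), with arg(z) taken from cmath.phase."""
    return complex(math.log(abs(z)), cmath.phase(z))

def log_correction(z1, z2):
    """Integer n with Log(z1 z2) = Log z1 + Log z2 + 2*pi*i*n, recovered by rounding."""
    diff = principal_log(z1 * z2) - principal_log(z1) - principal_log(z2)
    return round(diff.imag / (2 * math.pi))

print(principal_log(1j))                      # imaginary part is pi/2 (Example 1)
print(principal_log(-1))                      # imaginary part is pi (Example 3)
print(log_correction(-1 + 1j, -1 + 1j))       # -1
print(log_correction(-1 - 1j, -1 - 1j))       # 1
print(log_correction(1 + 1j, 1j))             # 0
```

The library function `cmath.log` computes the same principal value, so `principal_log` is written out here only to mirror the definition.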


1.31 COMPLEX POWERS 

Using complex logarithms, we can now give a definition of complex powers of
complex numbers.

Definition 1.55. If $z \ne 0$ and if w is any complex number, we define

$$z^w = e^{w \operatorname{Log} z}.$$

Examples

1. $i^i = e^{i \operatorname{Log} i} = e^{i(i\pi/2)} = e^{-\pi/2}$.

2. $(-1)^i = e^{i \operatorname{Log} (-1)} = e^{i(i\pi)} = e^{-\pi}$.

3. If n is an integer, then $z^{n+1} = e^{(n+1) \operatorname{Log} z} = e^{n \operatorname{Log} z} e^{\operatorname{Log} z} = z^n z$, so Definition 1.55 does
not conflict with Definition 1.49.

The next two theorems give rules for calculating with complex powers:

Theorem 1.56. $z^{w_1} z^{w_2} = z^{w_1 + w_2}$ if $z \ne 0$.

Proof. $z^{w_1 + w_2} = e^{(w_1 + w_2) \operatorname{Log} z} = e^{w_1 \operatorname{Log} z} e^{w_2 \operatorname{Log} z} = z^{w_1} z^{w_2}$.

Theorem 1.57. If $z_1 z_2 \ne 0$, then

$$(z_1 z_2)^w = z_1^w z_2^w e^{2\pi i w\, n(z_1, z_2)},$$

where $n(z_1, z_2)$ is the integer defined in Theorem 1.48.

Proof. $(z_1 z_2)^w = e^{w \operatorname{Log} (z_1 z_2)} = e^{w[\operatorname{Log} z_1 + \operatorname{Log} z_2 + 2\pi i\, n(z_1, z_2)]}$.
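
Complex powers defined through the principal logarithm can be computed as follows. The sketch assumes that `cmath.log` returns the principal logarithm; `c_pow` is a hypothetical helper, and the value $n = -1$ used below is worked out by hand from Theorem 1.48 for the chosen numbers.

```python
import cmath
import math

def c_pow(z, w):
    """z^w = e^{w Log z} for z != 0, using cmath.log as the principal logarithm."""
    return cmath.exp(w * cmath.log(z))

print(c_pow(1j, 1j))    # approximately e^{-pi/2}, a real number
print(c_pow(-1, 1j))    # approximately e^{-pi}

# Theorem 1.57 with z1 = z2 = -1 + i and w = 1/2. Here arg(z1) + arg(z2) = 3*pi/2,
# so n(z1, z2) = -1 by Theorem 1.48.
z1 = z2 = complex(-1, 1)
w, n = 0.5, -1
lhs = c_pow(z1 * z2, w)
rhs = c_pow(z1, w) * c_pow(z2, w) * cmath.exp(2j * math.pi * w * n)
print(cmath.isclose(lhs, rhs))
print(cmath.isclose(lhs, c_pow(z1, w) * c_pow(z2, w)))   # False: the naive rule fails
```

In this example $(z_1 z_2)^{1/2} = 1 - i$ while $z_1^{1/2} z_2^{1/2} = -1 + i$, so the factor $e^{2\pi i w n} = -1$ is exactly what reconciles the two sides.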


1.32 COMPLEX SINES AND COSINES

Definition 1.58. Given a complex number z, we define

$$\cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$

note. When z is real, these equations agree with Definition 1.40.

Theorem 1.59. If $z = x + iy$, then we have

$$\cos z = \cos x \cosh y - i \sin x \sinh y,$$
$$\sin z = \sin x \cosh y + i \cos x \sinh y.$$

Proof.

$$2 \cos z = e^{iz} + e^{-iz}$$
$$= e^{-y}(\cos x + i \sin x) + e^{y}(\cos x - i \sin x)$$
$$= \cos x (e^y + e^{-y}) - i \sin x (e^y - e^{-y})$$
$$= 2 \cos x \cosh y - 2i \sin x \sinh y.$$

The proof for sin z is similar. 

Further properties of sines and cosines are given in the exercises. 
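
The two descriptions of the complex cosine and sine, by exponentials and by real and imaginary parts, can be compared numerically. The function names below are hypothetical, and equality is tested only up to floating-point rounding.

```python
import cmath
import math

def cos_by_definition(z):
    """cos z = (e^{iz} + e^{-iz}) / 2, as in Definition 1.58."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

def sin_by_definition(z):
    """sin z = (e^{iz} - e^{-iz}) / (2i), as in Definition 1.58."""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

def cos_by_parts(z):
    """cos z = cos x cosh y - i sin x sinh y, as in Theorem 1.59."""
    x, y = z.real, z.imag
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))

def sin_by_parts(z):
    """sin z = sin x cosh y + i cos x sinh y, as in Theorem 1.59."""
    x, y = z.real, z.imag
    return complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))

z = complex(1.2, -0.7)
print(cmath.isclose(cos_by_definition(z), cos_by_parts(z)))
print(cmath.isclose(sin_by_definition(z), sin_by_parts(z)))
print(abs(cos_by_parts(complex(0.0, 5.0))))   # cosh 5, which exceeds 1
```

The last line illustrates that, unlike the real cosine, $|\cos z|$ is not bounded by 1 off the real axis.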


1.33 INFINITY AND THE EXTENDED COMPLEX PLANE C* 

Next we extend the complex number system by adjoining an ideal point denoted by
the symbol $\infty$.

Definition 1.60. By the extended complex number system C* we shall mean the
complex plane C along with a symbol $\infty$ which satisfies the following properties:

a) If $z \in C$, then we have $z + \infty = z - \infty = \infty$, $z/\infty = 0$.

b) If $z \in C$, but $z \ne 0$, then $z(\infty) = \infty$ and $z/0 = \infty$.

c) $\infty + \infty = (\infty)(\infty) = \infty$.

Definition 1.61. Every set in C of the form $\{z : |z| > r \ge 0\}$ is called a neighborhood
of $\infty$, or a ball with center at $\infty$.

The reader may wonder why two symbols, $+\infty$ and $-\infty$, are adjoined to R
but only one symbol, $\infty$, is adjoined to C. The answer lies in the fact that there is
an ordering relation < among the real numbers, but no such relation occurs
among the complex numbers. In order that certain properties of real numbers
involving the relation < hold without exception, we need two symbols, $+\infty$ and
$-\infty$, as defined above. We have already mentioned that in R* every nonempty
set has a sup, for example.

In C it turns out to be more convenient to have just one ideal point. By way
of illustration, let us recall the stereographic projection which establishes a one-to-one
correspondence between the points of the complex plane and those points
on the surface of the sphere distinct from the North Pole. The apparent exception
at the North Pole can be removed by regarding it as the geometric representative
of the ideal point $\infty$. We then get a one-to-one correspondence between the
extended complex plane C* and the total surface of the sphere. It is geometrically
evident that if the South Pole is placed on the origin of the complex plane, the
exterior of a “large” circle in the plane will correspond, by stereographic projection,
to a “small” spherical cap about the North Pole. This illustrates vividly why we
have defined a neighborhood of $\infty$ by an inequality of the form $|z| > r$.


EXERCISES 


Integers 

1.1 Prove that there is no largest prime. (A proof was known to Euclid.) 

1.2 If n is a positive integer, prove the algebraic identity

$$a^n - b^n = (a - b) \sum_{k=0}^{n-1} a^k b^{n-1-k}.$$

1.3 If $2^n - 1$ is prime, prove that n is prime. A prime of the form $2^p - 1$, where p is
prime, is called a Mersenne prime.

1.4 If $2^n + 1$ is prime, prove that n is a power of 2. A prime of the form $2^{2^m} + 1$ is
called a Fermat prime. Hint. Use Exercise 1.2.

1.5 The Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, ... are defined by the recursion formula
$x_{n+1} = x_n + x_{n-1}$, with $x_1 = x_2 = 1$. Prove that $(x_n, x_{n+1}) = 1$ and that
$x_n = (a^n - b^n)/(a - b)$, where a and b are the roots of the quadratic equation $x^2 - x - 1 = 0$.

1.6 Prove that every nonempty set of positive integers contains a smallest member.
This is called the well-ordering principle.

Rational and irrational numbers

1.7 Find the rational number whose decimal expansion is 0.3344444 . . .

1.8 Prove that the decimal expansion of x will end in zeros (or in nines) if, and only if,
x is a rational number whose denominator is of the form $2^n 5^m$, where m and n are
nonnegative integers.

1.9 Prove that $\sqrt{2} + \sqrt{3}$ is irrational.

1.10 If a, b, c, d are rational and if x is irrational, prove that $(ax + b)/(cx + d)$ is usually
irrational. When do exceptions occur?

1.11 Given any real $x > 0$, prove that there is an irrational number between 0 and x.

1.12 If $a/b < c/d$ with $b > 0$, $d > 0$, prove that $(a + c)/(b + d)$ lies between $a/b$
and $c/d$.

1.13 Let a and b be positive integers. Prove that $\sqrt{2}$ always lies between the two fractions
$a/b$ and $(a + 2b)/(a + b)$. Which fraction is closer to $\sqrt{2}$?

1.14 Prove that $\sqrt{n - 1} + \sqrt{n + 1}$ is irrational for every integer $n \ge 1$.

1.15 Given a real x and an integer $N > 1$, prove that there exist integers h and k with
$0 < k \le N$ such that $|kx - h| < 1/N$. Hint. Consider the $N + 1$ numbers $tx - [tx]$
for $t = 0, 1, 2, \ldots, N$ and show that some pair differs by at most $1/N$.

1.16 If x is irrational prove that there are infinitely many rational numbers $h/k$ with
$k > 0$ such that $|x - h/k| < 1/k^2$. Hint. Assume there are only a finite number
$h_1/k_1, \ldots, h_r/k_r$ and obtain a contradiction by applying Exercise 1.15 with $N > 1/\delta$,
where $\delta$ is the smallest of the numbers $|x - h_i/k_i|$.

1.17 Let x be a positive rational number of the form

$$x = \sum_{k=1}^{n} \frac{a_k}{k!},$$

where each $a_k$ is a nonnegative integer with $a_k \le k - 1$ for $k \ge 2$ and $a_n > 0$. Let [x]
denote the greatest integer in x. Prove that $a_1 = [x]$, that $a_k = [k!\,x] - k[(k - 1)!\,x]$
for $k = 2, \ldots, n$, and that n is the smallest integer such that $n!\,x$ is an integer.
Conversely, show that every positive rational number x can be expressed in this form in one
and only one way.


Upper bounds 

1.18 Show that the sup and inf of a set are uniquely determined whenever they exist. 

1.19 Find the sup and inf of each of the following sets of real numbers: 

a) All numbers of the form $2^{-p} + 3^{-q} + 5^{-r}$, where p, q, and r take on all positive
integer values.

b) $S = \{x : 3x^2 - 10x + 3 < 0\}$.

c) $S = \{x : (x - a)(x - b)(x - c)(x - d) < 0\}$, where $a < b < c < d$.

1.20 Prove the comparison property for suprema (Theorem 1.16). 

1.21 Let A and B be two sets of positive numbers bounded above, and let $a = \sup A$,
$b = \sup B$. Let C be the set of all products of the form xy, where $x \in A$ and $y \in B$.
Prove that $ab = \sup C$.

1.22 Given $x > 0$ and an integer $k \ge 2$. Let $a_0$ denote the largest integer $\le x$ and,
assuming that $a_0, a_1, \ldots, a_{n-1}$ have been defined, let $a_n$ denote the largest integer such
that

$$a_0 + \frac{a_1}{k} + \frac{a_2}{k^2} + \cdots + \frac{a_n}{k^n} \le x.$$

a) Prove that $0 \le a_i \le k - 1$ for each $i = 1, 2, \ldots$

b) Let $r_n = a_0 + a_1 k^{-1} + a_2 k^{-2} + \cdots + a_n k^{-n}$ and show that x is the sup of the
set of rational numbers $r_1, r_2, \ldots$

note. When $k = 10$ the integers $a_0, a_1, a_2, \ldots$ are the digits in a decimal representation
of x. For general k they provide a representation in the scale of k.


Inequalities 

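1.23 Prove Lagrange's identity for real numbers:

$$\left( \sum_{k=1}^{n} a_k b_k \right)^2 = \left( \sum_{k=1}^{n} a_k^2 \right) \left( \sum_{k=1}^{n} b_k^2 \right) - \sum_{1 \le k < j \le n} (a_k b_j - a_j b_k)^2.$$

Note that this identity implies the Cauchy-Schwarz inequality.

1.24 Prove that for arbitrary real $a_k$, $b_k$, $c_k$ we have

$$\left( \sum_{k=1}^{n} a_k b_k c_k \right)^4 \le \left( \sum_{k=1}^{n} a_k^4 \right) \left( \sum_{k=1}^{n} b_k^2 \right)^2 \left( \sum_{k=1}^{n} c_k^4 \right).$$

1.25 Prove Minkowski’s inequality:

$$\left( \sum_{k=1}^{n} (a_k + b_k)^2 \right)^{1/2} \le \left( \sum_{k=1}^{n} a_k^2 \right)^{1/2} + \left( \sum_{k=1}^{n} b_k^2 \right)^{1/2}.$$

This is the triangle inequality $\|\mathbf{a} + \mathbf{b}\| \le \|\mathbf{a}\| + \|\mathbf{b}\|$ for n-dimensional vectors, where
$\mathbf{a} = (a_1, \ldots, a_n)$, $\mathbf{b} = (b_1, \ldots, b_n)$ and

$$\|\mathbf{a}\| = \left( \sum_{k=1}^{n} a_k^2 \right)^{1/2}.$$

1.26 If $a_1 \ge a_2 \ge \cdots \ge a_n$ and $b_1 \ge b_2 \ge \cdots \ge b_n$, prove that

$$\left( \sum_{k=1}^{n} a_k \right) \left( \sum_{k=1}^{n} b_k \right) \le n \sum_{k=1}^{n} a_k b_k.$$

Hint. $\sum_{1 \le j \le k \le n} (a_k - a_j)(b_k - b_j) \ge 0$.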


Complex numbers 

1.27 Express the following complex numbers in the form $a + bi$.

a) $(1 + i)^3$, b) $(2 + 3i)/(3 - 4i)$,

c) $i^5 + i^{16}$, d) $\tfrac{1}{2}(1 + i)(1 + i^{-8})$.

1.28 In each case, determine all real x and y which satisfy the given relation.

a) $x + iy = |x - iy|$, b) $x + iy = (x - iy)^2$, c) $\sum_{k=0}^{100} i^k = x + iy$.

1.29 If $z = x + iy$, x and y real, the complex conjugate of z is the complex number
$\bar{z} = x - iy$. Prove that:

a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$, b) $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$, c) $z\bar{z} = |z|^2$,

d) $z + \bar{z}$ = twice the real part of z,

e) $(z - \bar{z})/i$ = twice the imaginary part of z.

1.30 Describe geometrically the set of complex numbers z which satisfies each of the
following conditions:

a) $|z| = 1$, b) $|z| < 1$, c) $|z| \le 1$,

d) $z + \bar{z} = 1$, e) $z - \bar{z} = i$, f) $\bar{z} + z = |z|^2$.

1.31 Given three complex numbers $z_1$, $z_2$, $z_3$ such that $|z_1| = |z_2| = |z_3| = 1$ and
$z_1 + z_2 + z_3 = 0$. Show that these numbers are vertices of an equilateral triangle
inscribed in the unit circle with center at the origin.

1.32 If a and b are complex numbers, prove that:

a) $|a - b|^2 \le (1 + |a|^2)(1 + |b|^2)$.

b) If $a \ne 0$, then $|a + b| = |a| + |b|$ if, and only if, $b/a$ is real and nonnegative.

1.33 If a and b are complex numbers, prove that

$$|a - b| = |1 - \bar{a}b|$$

if, and only if, $|a| = 1$ or $|b| = 1$. For which a and b is the inequality $|a - b| < |1 - \bar{a}b|$
valid?

1.34 If a and c are real constants, b complex, show that the equation

$$az\bar{z} + b\bar{z} + \bar{b}z + c = 0 \qquad (a \ne 0, \; z = x + iy)$$

represents a circle in the xy-plane.

1.35 Recall the definition of the inverse tangent: given a real number t, $\tan^{-1}(t)$ is the
unique real number $\theta$ which satisfies the two conditions

$$-\frac{\pi}{2} < \theta < +\frac{\pi}{2}, \qquad \tan \theta = t.$$

If $z = x + iy$, show that

a) $\arg (z) = \tan^{-1}(y/x)$, if $x > 0$,

b) $\arg (z) = \tan^{-1}(y/x) + \pi$, if $x < 0$, $y \ge 0$,

c) $\arg (z) = \tan^{-1}(y/x) - \pi$, if $x < 0$, $y < 0$.

1.36 Define the following “pseudo-ordering” of the complex numbers: we say $z_1 < z_2$
if we have either

i) $|z_1| < |z_2|$ or ii) $|z_1| = |z_2|$ and $\arg (z_1) < \arg (z_2)$.

Which of Axioms 6, 7, 8, 9 are satisfied by this relation?

1.37 Which of Axioms 6, 7, 8, 9 are satisfied if the pseudo-ordering is defined as follows?
We say $(x_1, y_1) < (x_2, y_2)$ if we have either

i) $x_1 < x_2$ or ii) $x_1 = x_2$ and $y_1 < y_2$.

1.38 State and prove a theorem analogous to Theorem 1.48, expressing $\arg (z_1/z_2)$ in
terms of $\arg (z_1)$ and $\arg (z_2)$.

1.39 State and prove a theorem analogous to Theorem 1.54, expressing $\operatorname{Log} (z_1/z_2)$ in
terms of $\operatorname{Log} (z_1)$ and $\operatorname{Log} (z_2)$.

1.40 Prove that the nth roots of 1 (also called the nth roots of unity) are given by $\alpha$,
$\alpha^2, \ldots, \alpha^n$, where $\alpha = e^{2\pi i/n}$, and show that the roots $\ne 1$ satisfy the equation

$$1 + x + x^2 + \cdots + x^{n-1} = 0.$$

1.41 a) Prove that $|z^i| < e^{\pi}$ for all complex $z \ne 0$.

b) Prove that there is no constant $M > 0$ such that $|\cos z| < M$ for all complex z.

1.42 If $w = u + iv$ (u, v real), show that

$$z^w = e^{u \log |z| - v \arg (z)} e^{i[v \log |z| + u \arg (z)]}.$$

1.43 a) Prove that $\operatorname{Log} (z^w) = w \operatorname{Log} z + 2\pi in$, where n is an integer.

b) Prove that $(z^w)^{\alpha} = z^{w\alpha} e^{2\pi in\alpha}$, where n is an integer.

1.44 i) If $\theta$ and a are real numbers, $-\pi < \theta \le +\pi$, prove that

$$(\cos \theta + i \sin \theta)^a = \cos (a\theta) + i \sin (a\theta).$$

ii) Show that, in general, the restriction $-\pi < \theta \le +\pi$ is necessary in (i) by taking
$\theta = -\pi$, $a = \tfrac{1}{2}$.

iii) If a is an integer, show that the formula in (i) holds without any restriction on $\theta$.
In this case it is known as DeMoivre’s theorem.

1.45 Use DeMoivre’s theorem (Exercise 1.44) to derive the trigonometric identities

$$\sin 3\theta = 3 \cos^2 \theta \sin \theta - \sin^3 \theta,$$
$$\cos 3\theta = \cos^3 \theta - 3 \cos \theta \sin^2 \theta,$$

valid for real $\theta$. Are these valid when $\theta$ is complex?

1.46 Define $\tan z = (\sin z)/(\cos z)$ and show that for $z = x + iy$, we have

$$\tan z = \frac{\sin 2x + i \sinh 2y}{\cos 2x + \cosh 2y}.$$

1.47 Let $w$ be a given complex number. If $w \neq \pm 1$, show that there exist two values of
$z = x + iy$ satisfying the conditions $\cos z = w$ and $-\pi < x \leq +\pi$. Find these values
when $w = i$ and when $w = 2$.

1.48 Prove Lagrange’s identity for complex numbers:

$\left| \sum_{k=1}^{n} a_k b_k \right|^2 = \sum_{k=1}^{n} |a_k|^2 \sum_{k=1}^{n} |b_k|^2 - \sum_{1 \leq k < j \leq n} |a_k \bar{b}_j - a_j \bar{b}_k|^2.$

Use this to deduce a Cauchy-Schwarz inequality for complex numbers.

1.49 a) By equating imaginary parts in DeMoivre’s formula prove that

$\sin n\theta = \sin^n\theta \left\{ \binom{n}{1} \cot^{n-1}\theta - \binom{n}{3} \cot^{n-3}\theta + \binom{n}{5} \cot^{n-5}\theta - + \cdots \right\}.$

b) If $0 < \theta < \pi/2$, prove that

$\sin(2m+1)\theta = \sin^{2m+1}\theta \, P_m(\cot^2\theta),$

where $P_m$ is the polynomial of degree $m$ given by

$P_m(x) = \binom{2m+1}{1} x^m - \binom{2m+1}{3} x^{m-1} + \binom{2m+1}{5} x^{m-2} - + \cdots.$

Use this to show that $P_m$ has zeros at the $m$ distinct points $x_k = \cot^2\{\pi k/(2m+1)\}$
for $k = 1, 2, \ldots, m$.

c) Show that the sum of the zeros of $P_m$ is given by

$\sum_{k=1}^{m} \cot^2 \frac{\pi k}{2m+1} = \frac{m(2m-1)}{3},$

and that the sum of their squares is given by

$\sum_{k=1}^{m} \cot^4 \frac{\pi k}{2m+1} = \frac{m(2m-1)(4m^2 + 10m - 9)}{45}.$

NOTE. These identities can be used to prove that $\sum_{n=1}^{\infty} n^{-2} = \pi^2/6$ and $\sum_{n=1}^{\infty} n^{-4} = \pi^4/90$.
(See Exercises 8.46 and 8.47.)

1.50 Prove that $z^n - 1 = \prod_{k=1}^{n} (z - e^{2\pi i k/n})$ for all complex $z$. Use this to derive the
formula

$\prod_{k=1}^{n-1} \sin\frac{k\pi}{n} = \frac{n}{2^{n-1}}$

for $n \geq 2$.

CHAPTER 2


SOME BASIC NOTIONS 
OF SET THEORY 


2.1 INTRODUCTION 

In discussing any branch of mathematics it is helpful to use the notation and
terminology of set theory. This subject, which was developed by Boole and Cantor
in the latter part of the 19th century, has had a profound influence on the
development of mathematics in the 20th century. It has unified many seemingly
disconnected ideas and has helped reduce many mathematical concepts to their
logical foundations in an elegant and systematic way.

We shall not attempt a systematic treatment of the theory of sets but shall 
confine ourselves to a discussion of some of the more basic concepts. The reader 
who wishes to explore the subject further can consult the references at the end of 
this chapter. 

A collection of objects viewed as a single entity will be referred to as a set. 
The objects in the collection will be called elements or members of the set, and they 
will be said to belong to or to be contained in the set. The set, in turn, will be said 
to contain or to be composed of its elements. For the most part we shall be
interested in sets of mathematical objects; that is, sets of numbers, points, functions,
curves, etc. However, since much of the theory of sets does not depend on the 
nature of the individual objects in the collection, we gain a great economy of 
thought by discussing sets whose elements may be objects of any kind. It is because 
of this quality of generality that the theory of sets has had such a strong effect in 
furthering the development of mathematics. 

2.2 NOTATIONS 

Sets will usually be denoted by capital letters:

A, B, C, ..., X, Y, Z,

and elements by lower-case letters: a, b, c, ..., x, y, z. We write $x \in S$ to mean
“x is an element of S,” or “x belongs to S.” If x does not belong to S, we write
$x \notin S$. We sometimes designate sets by displaying the elements in braces; for
example, the set of positive even integers less than 10 is denoted by {2, 4, 6, 8}.
If S is the collection of all x which satisfy a property P, we indicate this briefly by
writing $S = \{x : x \text{ satisfies } P\}$.

From a given set we can form new sets, called subsets of the given set. For
example, the set consisting of all positive integers less than 10 which are divisible
by 4, namely, {4, 8}, is a subset of the set of even integers less than 10. In general,
we say that a set A is a subset of B, and we write $A \subseteq B$ whenever every element
of A also belongs to B. The statement $A \subseteq B$ does not rule out the possibility
that $B \subseteq A$. In fact, we have both $A \subseteq B$ and $B \subseteq A$ if, and only if, A and B have
the same elements. In this case we shall call the sets A and B equal and we write
A = B. If A and B are not equal, we write $A \neq B$. If $A \subseteq B$ but $A \neq B$, then
we say that A is a proper subset of B.

It is convenient to consider the possibility of a set which contains no elements 
whatever; this set is called the empty set and we agree to call it a subset of every 
set. The reader may find it helpful to picture a set as a box containing certain 
objects, its elements. The empty set is then an empty box. We denote the empty
set by the symbol $\emptyset$.


2.3 ORDERED PAIRS 

Suppose we have a set consisting of two elements a and b ; that is, the set {a, b }. 
By our definition of equality this set is the same as the set {b, a}, since no question 
of order is involved. However, it is also necessary to consider sets of two elements 
in which order is important. For example, in analytic geometry of the plane, the 
coordinates (x, y) of a point represent an ordered pair of numbers. The point (3, 4) 
is different from the point (4, 3), whereas the set {3, 4} is the same as the set {4, 3}. 
When we wish to consider a set of two elements a and b as being ordered, we shall 
enclose the elements in parentheses : (a, b). Then a is called the first element and 
b the second. It is possible to give a purely set-theoretic definition of the concept 
of an ordered pair of objects (a, b). One such definition is the following: 

Definition 2.1. (a, b) = {{a}, {a, b}}. 

This definition states that (a, b) is a set containing two elements, {a} and
{a, b}. Using this definition, we can prove the following theorem:

Theorem 2.2. (a, b) = (c, d) if, and only if, a = c and b = d.

This theorem shows that Definition 2.1 is a “reasonable” definition of an 
ordered pair, in the sense that the object a has been distinguished from the object 
b. The proof of Theorem 2.2 will be an instructive exercise for the reader. (See 
Exercise 2.1.) 
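
Definition 2.1 and Theorem 2.2 can be tried out on a computer. The following is a minimal sketch, assuming Python's built-in `frozenset` as a stand-in for "set"; the helper name `kpair` is hypothetical and not part of the text. It is a brute-force check on a small universe, not a proof.

```python
def kpair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}} of Definition 2.1, built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Unordered sets ignore order; ordered pairs do not.
print({3, 4} == {4, 3})              # True
print(kpair(3, 4) == kpair(4, 3))    # False

# When a == b the pair collapses to {{a}}, a set with a single element.
print(len(kpair(1, 1)))              # 1

# Brute-force check of Theorem 2.2 on the universe {0, 1, 2, 3}.
U = range(4)
theorem_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a in U for b in U for c in U for d in U
)
print(theorem_holds)                 # True
```

The case a = b is the one that needs care when proving Theorem 2.2 by hand, since there the set {{a}, {a, b}} has only one element.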

2.4 CARTESIAN PRODUCT OF TWO SETS 

Definition 2.3. Given two sets A and B, the set of all ordered pairs (a, b) such that
$a \in A$ and $b \in B$ is called the cartesian product of A and B, and is denoted by $A \times B$.

Example. If R denotes the set of all real numbers, then $R \times R$ is the set of all complex
numbers.
2.5 RELATIONS AND FUNCTIONS

Let x and y denote real numbers, so that the ordered pair (x, y) can be thought of
as representing the rectangular coordinates of a point in the xy-plane (or a
complex number). We frequently encounter such expressions as

$xy = 1, \quad x^2 + y^2 = 1, \quad x^2 + y^2 < 1, \quad x < y.$ (a)

Each of these expressions defines a certain set of ordered pairs (x, y) of real
numbers, namely, the set of all pairs (x, y) for which the expression is satisfied.
Such a set of ordered pairs is called a plane relation. The corresponding set of
points plotted in the xy-plane is called the graph of the relation. The graphs of
the relations described in (a) are shown in Fig. 2.1.



Figure 2.1 

The concept of relation can be formulated quite generally so that the objects
x and y in the pairs (x, y) need not be numbers but may be objects of any kind.

Definition 2.4. Any set of ordered pairs is called a relation.

If S is a relation, the set of all elements x that occur as first members of pairs
(x, y) in S is called the domain of S, denoted by $\mathcal{D}(S)$. The set of second members
y is called the range of S, denoted by $\mathcal{R}(S)$.

The first example shown in Fig. 2.1 is a special kind of relation known as a 
function. 

Definition 2.5. A function F is a set of ordered pairs (x, y), no two of which have
the same first member. That is, if $(x, y) \in F$ and $(x, z) \in F$, then y = z.

The definition of function requires that for every x in the domain of F there is
exactly one y such that $(x, y) \in F$. It is customary to call y the value of F at x and
to write

y = F(x)

instead of $(x, y) \in F$ to indicate that the pair (x, y) is in the set F.
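
Definitions 2.4 and 2.5 translate directly into code when a relation is stored as a finite set of pairs. The sketch below is only an illustration under that assumption: the helper names `domain`, `range_of`, and `is_function` are hypothetical, and the two relations are the first two expressions in (a) restricted to a small integer grid rather than to all real numbers.

```python
def domain(rel):
    """Set of first members of the pairs in the relation."""
    return {x for (x, _) in rel}

def range_of(rel):
    """Set of second members of the pairs in the relation."""
    return {y for (_, y) in rel}

def is_function(rel):
    # No two pairs share a first member exactly when the number of
    # distinct first members equals the number of pairs.
    return len(domain(rel)) == len(rel)

grid = range(-3, 4)
hyperbola = {(x, y) for x in grid for y in grid if x * y == 1}
circle = {(x, y) for x in grid for y in grid if x * x + y * y == 1}

print(is_function(hyperbola))   # True
print(is_function(circle))      # False: (0, 1) and (0, -1) share a first member

# A function stored as pairs can be evaluated like y = F(x).
F = dict(hyperbola)
print(F[-1])                    # -1
```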

As an alternative to describing a function F by specifying the pairs it contains,
it is usually preferable to describe the domain of F, and then, for each x in the
domain, to describe how the function value F(x) is obtained. In this connection,
we have the following theorem whose proof is left as an exercise for the reader.

Theorem 2.6. Two functions F and G are equal if and only if

a) $\mathcal{D}(F) = \mathcal{D}(G)$ (F and G have the same domain), and

b) F(x) = G(x) for every x in $\mathcal{D}(F)$.


2.6 FURTHER TERMINOLOGY CONCERNING FUNCTIONS 

When the domain $\mathcal{D}(F)$ is a subset of R, then F is called a function of one real
variable. If $\mathcal{D}(F)$ is a subset of C, the complex number system, then F is called a
function of a complex variable.

If $\mathcal{D}(F)$ is a subset of a cartesian product $A \times B$, then F is called a function
of two variables. In this case we denote the function values by F(a, b) instead of
F((a, b)). A function of two real variables is one whose domain is a subset of
$R \times R$.

If S is a subset of $\mathcal{D}(F)$, we say that F is defined on S. In this case, the set
of F(x) such that $x \in S$ is called the image of S under F and is denoted by F(S). If
T is any set which contains F(S), then F is also called a mapping from S to T.
This is often denoted by writing

$F : S \to T.$

If F(S) = T, the mapping is said to be onto T. A mapping of S into itself is
sometimes called a transformation.
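
The image F(S) and the notion "onto T" can be illustrated for a function with a finite domain. This is a small sketch, assuming the function is stored as a Python dictionary; the names `image` and `is_onto` are hypothetical helpers introduced here.

```python
def image(F, S):
    """F(S): the set of values F(x) for x in S, where F is a dict."""
    return {F[x] for x in S}

def is_onto(F, S, T):
    """True when F maps S onto T, that is, when F(S) = T."""
    return image(F, S) == set(T)

F = {x: x * x for x in range(-3, 4)}     # F(x) = x^2 on {-3, ..., 3}
S = {-2, -1, 0, 1, 2}

print(sorted(image(F, S)))               # [0, 1, 4]
print(is_onto(F, S, {0, 1, 4}))          # True
print(is_onto(F, S, {0, 1, 2, 3, 4}))    # False: a mapping from S to T, but not onto T
```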

Consider, for example, the function of a complex variable defined by the
equation $F(z) = z^2$. This function maps every sector S of the form $0 < \arg(z) <
\alpha < \pi/2$ of the complex z-plane onto a sector F(S) described by the inequalities
$0 < \arg[F(z)] < 2\alpha$. (See Fig. 2.2.)



Figure 2.2 


If two functions F and G satisfy the inclusion relation $G \subseteq F$, we say that G
is a restriction of F or that F is an extension of G. In particular, if S is a subset of
$\mathcal{D}(F)$ and if G is defined by the equation

G(x) = F(x) for all x in S,

then we call G the restriction of F to S. The function G consists of those pairs
(x, F(x)) such that $x \in S$. Its domain is S and its range is F(S).

2.7 ONE-TO-ONE FUNCTIONS AND INVERSES

Definition 2.7. Let F be a function defined on S. We say F is one-to-one on S if
and only if for every x and y in S,

F(x) = F(y) implies x = y.

This is the same as saying that a function which is one-to-one on S assigns 
distinct function values to distinct members of S. Such functions are also called 
injective . They are important because, as we shall presently see, they possess 
inverses. However, before stating the definition of the inverse of a function, it is 
convenient to introduce a more general notion, that of the converse of a relation. 

Definition 2.8. Given a relation S, the new relation $\check{S}$ defined by

$\check{S} = \{(a, b) : (b, a) \in S\}$

is called the converse of S.

Thus an ordered pair (a, b) belongs to $\check{S}$ if, and only if, the pair (b, a), with
elements interchanged, belongs to S. When S is a plane relation, this simply means
that the graph of $\check{S}$ is the reflection of the graph of S with respect to the line
y = x. In the relation defined by x < y, the converse relation is defined by y < x.

Definition 2.9. Suppose that the relation F is a function. Consider the converse
relation $\check{F}$, which may or may not be a function. If $\check{F}$ is also a function, then $\check{F}$ is
called the inverse of F and is denoted by $F^{-1}$.

Figure 2.3(a) illustrates an example of a function F for which $\check{F}$ is not a function.
In Fig. 2.3(b) both F and its converse are functions.

The next theorem tells us that a function which is one-to-one on its domain 
always has an inverse. 




Figure 2.3

Theorem 2.10. If the function F is one-to-one on its domain, then $\check{F}$ is also a function.

Proof. To show that $\check{F}$ is a function, we must show that if $(x, y) \in \check{F}$ and $(x, z) \in \check{F}$,
then y = z. But $(x, y) \in \check{F}$ means that $(y, x) \in F$; that is, x = F(y). Similarly,
$(x, z) \in \check{F}$ means that x = F(z). Thus F(y) = F(z) and, since we are assuming
that F is one-to-one, this implies y = z. Hence, $\check{F}$ is a function.

NOTE. The same argument shows that if F is one-to-one on a subset S of $\mathcal{D}(F)$,
then the restriction of F to S has an inverse.
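
The converse of a relation and the content of Theorem 2.10 can be checked on small finite examples. The following is a sketch under the assumption that relations are finite sets of pairs; the helper names `converse`, `is_function`, and `is_one_to_one` are hypothetical.

```python
def converse(rel):
    """Converse relation: interchange the members of every pair."""
    return {(b, a) for (a, b) in rel}

def is_function(rel):
    """No two pairs share a first member."""
    return len({x for (x, _) in rel}) == len(rel)

def is_one_to_one(F):
    """For a function F: distinct first members have distinct values."""
    return len({y for (_, y) in F}) == len(F)

square = {(x, x * x) for x in range(-2, 3)}    # not one-to-one
cube = {(x, x ** 3) for x in range(-2, 3)}     # one-to-one

print(is_one_to_one(square), is_function(converse(square)))   # False False
print(is_one_to_one(cube), is_function(converse(cube)))       # True True

# Restricting square to the nonnegative integers makes it one-to-one,
# so the restriction has an inverse, as in the note after Theorem 2.10.
restricted = {(x, y) for (x, y) in square if x >= 0}
print(is_function(converse(restricted)))                       # True
```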


2.8 COMPOSITE FUNCTIONS 

Definition 2.11. Given two functions F and G such that $\mathcal{R}(F) \subseteq \mathcal{D}(G)$, we can form
a new function, the composite $G \circ F$ of G and F, defined as follows: for every x in
the domain of F, $(G \circ F)(x) = G[F(x)]$.

Since $\mathcal{R}(F) \subseteq \mathcal{D}(G)$, the element F(x) is in the domain of G, and therefore it
makes sense to consider G[F(x)]. In general, it is not true that $G \circ F = F \circ G$.
In fact, $F \circ G$ may be meaningless unless the range of G is contained in the domain
of F. However, the associative law,

$H \circ (G \circ F) = (H \circ G) \circ F,$

always holds whenever each side of the equation has a meaning. (Verification will
be an interesting exercise for the reader. See Exercise 2.4.)
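
The composite of Definition 2.11 can be sketched for functions with finite domains. The example assumes functions are stored as dictionaries; `compose` is a hypothetical helper that refuses to form the composite when the range of F is not contained in the domain of G. The final comparison only illustrates the associative law on one example and does not replace the proof asked for in Exercise 2.4.

```python
def compose(G, F):
    """Return G o F for dict-based functions; needs range(F) within domain(G)."""
    if not set(F.values()) <= set(G.keys()):
        raise ValueError("range of F is not contained in the domain of G")
    return {x: G[F[x]] for x in F}

F = {1: 2, 2: 3, 3: 1}
P = {1: 2, 2: 1, 3: 3}
G = {1: "a", 2: "b", 3: "c"}
H = {"a": 10, "b": 20, "c": 30}

print(compose(G, F))                      # {1: 'b', 2: 'c', 3: 'a'}
print(compose(F, P) == compose(P, F))     # False: composition is not commutative

# F o G has no meaning here, because the range of G is not inside the domain of F.
try:
    compose(F, G)
except ValueError as err:
    print("undefined:", err)

# Both groupings of H, G, F give the same function.
print(compose(H, compose(G, F)) == compose(compose(H, G), F))   # True
```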


2.9 SEQUENCES 

Among the important examples of functions are those defined on subsets of the 
integers. 

Definition 2.12. By a finite sequence of n terms we shall understand a function F
whose domain is the set of numbers {1, 2, ..., n}.

The range of F is the set {F(1), F(2), F(3), ..., F(n)}, customarily written
$\{F_1, F_2, F_3, \ldots, F_n\}$. The elements of the range are called terms of the sequence
and, of course, they may be arbitrary objects of any kind.

Definition 2.13. By an infinite sequence we shall mean a function F whose domain
is the set {1, 2, 3, ...} of all positive integers. The range of F, that is, the set
{F(1), F(2), F(3), ...}, is also written $\{F_1, F_2, F_3, \ldots\}$, and the function value $F_n$
is called the nth term of the sequence.

For brevity, we shall occasionally use the notation $\{F_n\}$ to denote the infinite
sequence whose nth term is $F_n$.

Let $s = \{s_n\}$ be an infinite sequence, and let k be a function whose domain is
the set of positive integers and whose range is a subset of the positive integers.
Assume that k is “order-preserving,” that is, assume that

$k(m) < k(n)$, if m < n.

Then the composite function $s \circ k$ is defined for all integers $n \geq 1$, and for every
such n we have

$(s \circ k)(n) = s_{k(n)}.$

Such a composite function is said to be a subsequence of s. Again, for brevity,
we often use the notation $\{s_{k(n)}\}$ or $\{s_{k_n}\}$ to denote the subsequence of $\{s_n\}$ whose
nth term is $s_{k(n)}$.

Example. Let $s = \{1/n\}$ and let k be defined by $k(n) = 2^n$. Then $s \circ k = \{1/2^n\}$.
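
The example above can be reproduced by treating both the sequence s and the index function k as ordinary Python functions on the positive integers. This is an illustrative sketch; the names `s`, `k`, and `sub` mirror the text, and exact fractions from the standard library are used to avoid rounding.

```python
from fractions import Fraction

def s(n):
    """The sequence s_n = 1/n."""
    return Fraction(1, n)

def k(n):
    """Order-preserving index function k(n) = 2^n."""
    return 2 ** n

def sub(n):
    """The subsequence s o k, whose nth term is s_{k(n)}."""
    return s(k(n))

first_terms = [sub(n) for n in range(1, 6)]
print([str(t) for t in first_terms])     # ['1/2', '1/4', '1/8', '1/16', '1/32']

# k is order-preserving on the indices we looked at.
print(all(k(m) < k(n) for m in range(1, 10) for n in range(m + 1, 10)))   # True
```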


2.10 SIMILAR (EQUINUMEROUS) SETS 

Definition 2.14. Two sets A and B are called similar, or equinumerous, and we write
$A \sim B$, if and only if there exists a one-to-one function F whose domain is the set A
and whose range is the set B.

We also say that F establishes a one-to-one correspondence between the sets
A and B. Clearly, every set A is similar to itself (take F to be the “identity” function
for which F(x) = x for all x in A). Furthermore, if $A \sim B$ then $B \sim A$, because
if F is a one-to-one function which makes A similar to B, then $F^{-1}$ will make B
similar to A. Also, if $A \sim B$ and if $B \sim C$, then $A \sim C$. (The proof is left to
the reader.)


2.11 FINITE AND INFINITE SETS 
A set S is called finite and is said to contain n elements if

$S \sim \{1, 2, \ldots, n\}.$

The integer n is called the cardinal number of S. It is an easy exercise to prove
that if $\{1, 2, \ldots, n\} \sim \{1, 2, \ldots, m\}$ then m = n. Therefore, the cardinal
number of a finite set is well defined. The empty set is also considered finite. Its
cardinal number is defined to be 0.

Sets which are not finite are called infinite sets. The chief difference between
the two is that an infinite set must be similar to some proper subset of itself,
whereas a finite set cannot be similar to any proper subset of itself. (See Exercise
2.13.) For example, the set $\mathbf{Z}^+$ of all positive integers is similar to the proper subset
{2, 4, 8, 16, ...} consisting of powers of 2. The one-to-one function F which
makes them similar is defined by $F(x) = 2^x$ for each x in $\mathbf{Z}^+$.

2.12 COUNTABLE AND UNCOUNTABLE SETS

A set S is said to be countably infinite if it is equinumerous with the set of all
positive integers; that is, if

$S \sim \{1, 2, 3, \ldots\}.$

In this case there is a function f which establishes a one-to-one correspondence
between the positive integers and the elements of S; hence the set S can be
displayed as follows:

$S = \{f(1), f(2), f(3), \ldots\}.$

Often we use subscripts and denote f(k) by $a_k$ (or by a similar notation) and we
write $S = \{a_1, a_2, a_3, \ldots\}$. The important thing here is that the correspondence
enables us to use the positive integers as “labels” for the elements of S. A
countably infinite set is said to have cardinal number $\aleph_0$ (read: aleph nought).

Definition 2.15. A set S is called countable if it is either finite or countably infinite. 
A set which is not countable is called uncountable. 

The words denumerable and nondenumerable are sometimes used in place of 
countable and uncountable. 


Theorem 2.16. Every subset of a countable set is countable. 

Proof. Let S be the given countable set and assume $A \subseteq S$. If A is finite, there is
nothing to prove, so we can assume that A is infinite (which means S is also
infinite). Let $s = \{s_n\}$ be an infinite sequence of distinct terms such that

$S = \{s_1, s_2, \ldots\}.$

Define a function on the positive integers as follows:

Let k(1) be the smallest positive integer m such that $s_m \in A$. Assuming that
k(1), k(2), ..., k(n - 1) have been defined, let k(n) be the smallest positive
integer m > k(n - 1) such that $s_m \in A$. Then k is order-preserving: m > n
implies k(m) > k(n). Form the composite function $s \circ k$. The domain of $s \circ k$ is
the set of positive integers and the range of $s \circ k$ is A. Furthermore, $s \circ k$ is
one-to-one, since

$s[k(n)] = s[k(m)]$

implies

$s_{k(n)} = s_{k(m)},$

which implies k(n) = k(m), and this implies n = m. This proves the theorem.
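
The index function k built in the proof of Theorem 2.16 can be computed explicitly for a concrete choice of S and A. The sketch below makes its own assumptions: S is the set of all integers, enumerated as 0, 1, -1, 2, -2, ..., and A is the subset of multiples of 3. The names `s`, `in_A`, and `k_values` are hypothetical.

```python
from itertools import count, islice

def s(n):
    """An enumeration of the integers: s_1, s_2, ... = 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)

def in_A(x):
    """Membership test for A, here the multiples of 3 (an infinite subset of S)."""
    return x % 3 == 0

def k_values():
    """Yield k(1), k(2), ...: each is the least index m > k(n-1) with s_m in A."""
    for m in count(1):
        if in_A(s(m)):
            yield m

ks = list(islice(k_values(), 6))
print(ks)                       # [1, 6, 7, 12, 13, 18]
print([s(m) for m in ks])       # [0, 3, -3, 6, -6, 9]
```

Scanning the indices in increasing order and keeping those whose term lies in A produces exactly the order-preserving k of the proof, and the second printed list shows the first terms of the composite s ∘ k, which enumerates A without repetition.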


2.13 UNCOUNTABILITY OF THE REAL NUMBER SYSTEM 

The next theorem shows that there are infinite sets which are not countable. 

Theorem 2.17. The set of all real numbers is uncountable.

Proof. It suffices to show that the set of x satisfying 0 < x < 1 is uncountable.
If the real numbers in this interval were countable, there would be a sequence
$s = \{s_n\}$ whose terms would constitute the whole interval. We shall show that this
is impossible by constructing, in the interval, a real number which is not a term
of this sequence. Write each $s_n$ as an infinite decimal:

$s_n = 0.u_{n,1}u_{n,2}u_{n,3}\ldots,$

where each $u_{n,i}$ is 0, 1, ..., or 9. Consider the real number y which has the decimal
expansion

$y = 0.v_1 v_2 v_3 \ldots,$

where

$v_n = \begin{cases} 1, & \text{if } u_{n,n} \neq 1, \\ 2, & \text{if } u_{n,n} = 1. \end{cases}$

Then no term of the sequence $\{s_n\}$ can be equal to y, since y differs from $s_1$ in the
first decimal place, differs from $s_2$ in the second decimal place, ..., from $s_n$ in
the nth decimal place. (A situation like $s_n = 0.1999\ldots$ and $y = 0.2000\ldots$
cannot occur here because of the way the $v_n$ are chosen.) Since 0 < y < 1, the
theorem is proved.
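
The diagonal rule for choosing the digits of y can be illustrated on a finite table of digits. This is only a sketch: a program can handle finitely many rows and digits, so it illustrates the rule and not the proof itself. The function name `diagonal` and the sample rows are made up for the illustration.

```python
def diagonal(expansions):
    """Row n holds the first decimal digits of s_{n+1}; return the digits of y.

    The nth digit of y is 1 unless the nth digit of the nth row is 1,
    in which case it is 2.
    """
    return "".join(
        "2" if row[n] == "1" else "1"
        for n, row in enumerate(expansions)
    )

rows = [
    "1415926",
    "7182818",
    "4142135",
    "5772156",
    "6180339",
    "3010299",
    "6931471",
]
y = diagonal(rows)
print("0." + y)     # 0.2211112

# y differs from the nth row in the nth decimal place.
print(all(y[n] != rows[n][n] for n in range(len(rows))))   # True
```

Because every digit of y is 1 or 2, y cannot be an expansion ending in all 0's or all 9's, which is the point of the parenthetical remark in the proof.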

Theorem 2.18. Let $\mathbf{Z}^+$ denote the set of all positive integers. Then the cartesian
product $\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable.

Proof. Define a function f on $\mathbf{Z}^+ \times \mathbf{Z}^+$ as follows:

$f(m, n) = 2^m 3^n$, if $(m, n) \in \mathbf{Z}^+ \times \mathbf{Z}^+$.

Then f is one-to-one on $\mathbf{Z}^+ \times \mathbf{Z}^+$ and the range of f is a subset of $\mathbf{Z}^+$.
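
The function used in the proof of Theorem 2.18 is easy to experiment with. In the sketch below, `f` follows the text, while `f_inverse` is a hypothetical helper that recovers (m, n) by dividing out the factors 2 and 3; it works because of unique factorization, which is also why f is one-to-one. The check covers only a finite grid of pairs.

```python
def f(m, n):
    """The function f(m, n) = 2^m * 3^n from the proof of Theorem 2.18."""
    return 2 ** m * 3 ** n

def f_inverse(v):
    """Recover (m, n) from v = 2^m * 3^n by counting factors of 2 and 3."""
    m = n = 0
    while v % 2 == 0:
        v //= 2
        m += 1
    while v % 3 == 0:
        v //= 3
        n += 1
    if v != 1:
        raise ValueError("not of the form 2^m * 3^n")
    return m, n

pairs = [(m, n) for m in range(1, 13) for n in range(1, 13)]
values = [f(m, n) for (m, n) in pairs]

print(len(set(values)) == len(pairs))                      # True: no two pairs collide
print(f(5, 7), f_inverse(f(5, 7)))                         # 69984 (5, 7)
print(all(f_inverse(f(m, n)) == (m, n) for (m, n) in pairs))   # True
```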


2.14 SET ALGEBRA 

Given two sets $A_1$ and $A_2$, we define a new set, called the union of $A_1$ and $A_2$,
denoted by $A_1 \cup A_2$, as follows:

Definition 2.19. The union $A_1 \cup A_2$ is the set of those elements which belong
either to $A_1$ or to $A_2$ or to both.

This is the same as saying that $A_1 \cup A_2$ consists of those elements which belong
to at least one of the sets $A_1$, $A_2$. Since there is no question of order involved in
this definition, the union $A_1 \cup A_2$ is the same as $A_2 \cup A_1$; that is, set addition is
commutative. The definition is also phrased in such a way that set addition is
associative:

$A_1 \cup (A_2 \cup A_3) = (A_1 \cup A_2) \cup A_3.$

The definition of union can be extended to any finite or infinite collection of
sets:

Definition 2.20. If F is an arbitrary collection of sets, then the union of all the sets
in F is defined to be the set of those elements which belong to at least one of the sets
in F, and is denoted by

$\bigcup_{A \in F} A.$

If F is a finite collection of sets, $F = \{A_1, \ldots, A_n\}$, we write

$\bigcup_{A \in F} A = \bigcup_{k=1}^{n} A_k = A_1 \cup A_2 \cup \cdots \cup A_n.$

If F is a countable collection, $F = \{A_1, A_2, \ldots\}$, we write

$\bigcup_{A \in F} A = \bigcup_{k=1}^{\infty} A_k = A_1 \cup A_2 \cup \cdots$

Definition 2.21. If F is an arbitrary collection of sets, the intersection of all sets in
F is defined to be the set of those elements which belong to every one of the sets in F,
and is denoted by

$\bigcap_{A \in F} A.$

The intersection of two sets $A_1$ and $A_2$ is denoted by $A_1 \cap A_2$ and consists
of those elements common to both sets. If $A_1$ and $A_2$ have no elements in common,
then $A_1 \cap A_2$ is the empty set and $A_1$ and $A_2$ are said to be disjoint. If F is a
finite collection (as above), we write

$\bigcap_{A \in F} A = \bigcap_{k=1}^{n} A_k = A_1 \cap A_2 \cap \cdots \cap A_n,$

and if F is a countable collection, we write

$\bigcap_{A \in F} A = \bigcap_{k=1}^{\infty} A_k = A_1 \cap A_2 \cap \cdots$

If the sets in the collection have no elements in common, their intersection is the
empty set. Our definitions of union and intersection apply, of course, even when
F is not countable. Because of the way we have defined unions and intersections,
the commutative and associative laws are automatically satisfied.

Definition 2.22. The complement of A relative to B, denoted by B - A, is defined
to be the set

$B - A = \{x : x \in B, \text{ but } x \notin A\}.$

Note that B - (B - A) = A whenever $A \subseteq B$. Also note that B - A = B if
$B \cap A$ is empty.

The notions of union, intersection, and complement are illustrated in Fig. 2.4.

Figure 2.4


Theorem 2.23. Let F be a collection of sets. Then for any set B, we have

$B - \bigcup_{A \in F} A = \bigcap_{A \in F} (B - A),$

and

$B - \bigcap_{A \in F} A = \bigcup_{A \in F} (B - A).$

Proof. Let $S = \bigcup_{A \in F} A$, $T = \bigcap_{A \in F} (B - A)$. If $x \in B - S$, then $x \in B$, but
$x \notin S$. Hence, it is not true that x belongs to at least one A in F; therefore x
belongs to no A in F. Hence, for every A in F, $x \in B - A$. But this implies
$x \in T$, so that $B - S \subseteq T$. Reversing the steps, we obtain $T \subseteq B - S$, and this
proves that B - S = T. To prove the second statement, use a similar argument.
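
Both identities of Theorem 2.23 can be checked on a small finite collection using Python's built-in set operations. The sets below are arbitrary sample data chosen for this sketch, so the check illustrates the statement rather than proving it.

```python
B = set(range(12))
F = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 11}]

union_F = set().union(*F)                  # union of all A in F
inter_F = set.intersection(*F)             # intersection of all A in F
complements = [B - A for A in F]           # the sets B - A

# B minus the union equals the intersection of the complements.
print(B - union_F == set.intersection(*complements))    # True
print(sorted(B - union_F))                               # [6, 8, 9, 10]

# B minus the intersection equals the union of the complements.
print(B - inter_F == set().union(*complements))         # True
print(sorted(inter_F))                                   # [3]
```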


2.15 COUNTABLE COLLECTIONS OF COUNTABLE SETS 

Definition 2.24. If F is a collection of sets such that every two distinct sets in F are 
disjoint, then F is said to be a collection of disjoint sets. 

Theorem 2.25. If F is a countable collection of disjoint sets, say $F = \{A_1, A_2, \ldots\}$,
such that each set $A_n$ is countable, then the union $\bigcup_{k=1}^{\infty} A_k$ is also countable.

Proof. Let $A_n = \{a_{1,n}, a_{2,n}, a_{3,n}, \ldots\}$, n = 1, 2, ..., and let $S = \bigcup_{k=1}^{\infty} A_k$.
Then every element x of S is in at least one of the sets in F and hence $x = a_{m,n}$ for
some pair of integers (m, n). The pair (m, n) is uniquely determined by x, since
F is a collection of disjoint sets. Hence the function f defined by f(x) = (m, n) if
$x = a_{m,n}$, $x \in S$, has domain S. The range f(S) is a subset of $\mathbf{Z}^+ \times \mathbf{Z}^+$ (where $\mathbf{Z}^+$
is the set of positive integers) and hence is countable. But f is one-to-one and
therefore $S \sim f(S)$, which means that S is also countable.

Theorem 2.26. If $F = \{A_1, A_2, \ldots\}$ is a countable collection of sets, let
$G = \{B_1, B_2, \ldots\}$, where $B_1 = A_1$ and, for n > 1,

$B_n = A_n - \bigcup_{k=1}^{n-1} A_k.$

Then G is a collection of disjoint sets, and we have

$\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k.$

Proof. Each set $B_n$ is constructed so that it has no elements in common with the
earlier sets $B_1, B_2, \ldots, B_{n-1}$. Hence G is a collection of disjoint sets. Let
$A = \bigcup_{k=1}^{\infty} A_k$ and $B = \bigcup_{k=1}^{\infty} B_k$. We shall show that A = B. First of all, if
$x \in A$, then $x \in A_k$ for some k. If n is the smallest such k, then $x \in A_n$ but
$x \notin \bigcup_{k=1}^{n-1} A_k$, which means that $x \in B_n$, and therefore $x \in B$. Hence $A \subseteq B$.

Conversely, if $x \in B$, then $x \in B_n$ for some n, and therefore $x \in A_n$ for this same n.
Thus $x \in A$ and this proves that $B \subseteq A$.
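
The construction of the sets B_n in Theorem 2.26 can be carried out directly for a finite list of sets. This is a minimal sketch with made-up sample data; `disjointify` is a hypothetical helper that keeps a running union of the earlier sets and subtracts it from each new one.

```python
def disjointify(sets):
    """Return [B_1, B_2, ...] with B_n = A_n minus the union of A_1, ..., A_{n-1}."""
    seen = set()          # union of the sets processed so far
    result = []
    for A in sets:
        result.append(set(A) - seen)
        seen |= set(A)
    return result

A = [{1, 2, 3}, {2, 3, 4}, {1, 4, 5}, {5}]
B = disjointify(A)
print(B)                  # [{1, 2, 3}, {4}, {5}, set()]

# The B's are pairwise disjoint and have the same union as the A's.
disjoint = all(
    B[i].isdisjoint(B[j])
    for i in range(len(B)) for j in range(i + 1, len(B))
)
print(disjoint)                                    # True
print(set().union(*A) == set().union(*B))          # True
```

Some of the sets B_n may be empty, as the last one is here; this does not affect either conclusion of the theorem.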

Using Theorems 2.25 and 2.26, we immediately obtain 

Theorem 2.27 . If F is a countable collection of countable sets , then the union of all 
sets in F is also a countable set . 

Example 1. The set Q of all rational numbers is a countable set.

Proof. Let $A_n$ denote the set of all positive rational numbers having denominator n.
The set of all positive rational numbers is equal to $\bigcup_{k=1}^{\infty} A_k$. From this it follows that
Q is countable, since each $A_n$ is countable.

Example 2. The set S of intervals with rational endpoints is a countable set.

Proof. Let $\{x_1, x_2, \ldots\}$ denote the set of rational numbers and let $A_n$ be the set of all
intervals whose left endpoint is $x_n$ and whose right endpoint is rational. Then $A_n$ is
countable and $S = \bigcup_{k=1}^{\infty} A_k$.


EXERCISES 


2.1 Prove Theorem 2.2. Hint. (a, b) = (c, d) means {{a}, {a, b}} = {{c}, {c, d}}.
Now appeal to the definition of set equality.

2.2 Let S be a relation and let $\mathcal{D}(S)$ be its domain. The relation S is said to be

i) reflexive if $a \in \mathcal{D}(S)$ implies $(a, a) \in S$,

ii) symmetric if $(a, b) \in S$ implies $(b, a) \in S$,

iii) transitive if $(a, b) \in S$ and $(b, c) \in S$ implies $(a, c) \in S$.

A relation which is symmetric, reflexive, and transitive is called an equivalence relation.
Determine which of these properties is possessed by S, if S is the set of all pairs of real
numbers (x, y) such that

a) $x \leq y$,
b) $x < y$,
c) $x < |y|$,
d) $x^2 + y^2 = 1$,
e) $x^2 + y^2 < 0$,
f) $x^2 + x = y^2 + y$.


2.3 The following functions F and G are defined for all real x by the equations given.
In each case where the composite function $G \circ F$ can be formed, give the domain of
$G \circ F$ and a formula (or formulas) for $(G \circ F)(x)$.

a) $F(x) = 1 - x$, $G(x) = x^2 + 2x$.

b) $F(x) = x + 5$, $G(x) = |x|/x$, if $x \neq 0$, $G(0) = 1$.

c) $F(x) = \begin{cases} 2x, & \text{if } 0 < x < 1, \\ 1, & \text{otherwise,} \end{cases} \qquad G(x) = \begin{cases} x^2, & \text{if } 0 < x < 1, \\ 0, & \text{otherwise.} \end{cases}$

Find F(x) if G(x) and G[F(x)] are given as follows:

d) $G(x) = x^3$, $G[F(x)] = x^3 - 3x^2 + 3x - 1$.

e) $G(x) = 3 + x + x^2$, $G[F(x)] = x^2 - 3x + 5$.

2.4 Given three functions F, G, H, what restrictions must be placed on their domains
so that the following four composite functions can be defined?

$G \circ F, \quad H \circ G, \quad H \circ (G \circ F), \quad (H \circ G) \circ F.$

Assuming that $H \circ (G \circ F)$ and $(H \circ G) \circ F$ can be defined, prove the associative law:

$H \circ (G \circ F) = (H \circ G) \circ F.$

2.5 Prove the following set-theoretic identities for union and intersection:

a) $A \cup (B \cup C) = (A \cup B) \cup C$, $A \cap (B \cap C) = (A \cap B) \cap C$.

b) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

c) $(A \cup B) \cap (A \cup C) = A \cup (B \cap C)$.

d) $(A \cup B) \cap (B \cup C) \cap (C \cup A) = (A \cap B) \cup (A \cap C) \cup (B \cap C)$.

e) $A \cap (B - C) = (A \cap B) - (A \cap C)$.

f) $(A - C) \cap (B - C) = (A \cap B) - C$.

g) $(A - B) \cup B = A$ if, and only if, $B \subseteq A$.

2.6 Let $f : S \to T$ be a function. If A and B are arbitrary subsets of S, prove that

$f(A \cup B) = f(A) \cup f(B)$ and $f(A \cap B) \subseteq f(A) \cap f(B)$.

Generalize to arbitrary unions and intersections.

2.7 Let $f : S \to T$ be a function. If $Y \subseteq T$, we denote by $f^{-1}(Y)$ the largest subset of S
which f maps into Y. That is,

$f^{-1}(Y) = \{x : x \in S \text{ and } f(x) \in Y\}.$

The set $f^{-1}(Y)$ is called the inverse image of Y under f. Prove the following for arbitrary
subsets X of S and Y of T.

a) $X \subseteq f^{-1}[f(X)]$, b) $f[f^{-1}(Y)] \subseteq Y$,

c) $f^{-1}[Y_1 \cup Y_2] = f^{-1}(Y_1) \cup f^{-1}(Y_2)$,

d) $f^{-1}(Y_1 \cap Y_2) = f^{-1}(Y_1) \cap f^{-1}(Y_2)$,

e) $f^{-1}(T - Y) = S - f^{-1}(Y)$.

f) Generalize (c) and (d) to arbitrary unions and intersections.

2.8 Refer to Exercise 2.7. Prove that $f[f^{-1}(Y)] = Y$ for every subset Y of T if, and
only if, T = f(S).

2.9 Let $f : S \to T$ be a function. Prove that the following statements are equivalent.

a) f is one-to-one on S.

b) $f(A \cap B) = f(A) \cap f(B)$ for all subsets A, B of S.

c) $f^{-1}[f(A)] = A$ for every subset A of S.

d) For all disjoint subsets A and B of S, the images f(A) and f(B) are disjoint.

e) For all subsets A and B of S with $B \subseteq A$, we have

$f(A - B) = f(A) - f(B).$

2.10 Prove that if A ~ B and B ~ C, then A ~ C. 

2.11 If $\{1, 2, \ldots, n\} \sim \{1, 2, \ldots, m\}$, prove that m = n.

2.12 If S is an infinite set, prove that S contains a countably infinite subset. Hint. Choose
an element $a_1$ in S and consider $S - \{a_1\}$.

2.13 Prove that every infinite set S contains a proper subset similar to S. 

2.14 If A is a countable set and B an uncountable set, prove that B — A is similar to B . 

2.15 A real number is called algebraic if it is a root of an algebraic equation f(x) = 0,
where $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ is a polynomial with integer coefficients. Prove
that the set of all polynomials with integer coefficients is countable and deduce that the
set of algebraic numbers is also countable.

2.16 Let S be a finite set consisting of n elements and let T be the collection of all subsets
of S. Show that T is a finite set and find the number of elements in T.

2.17 Let R denote the set of real numbers and let S denote the set of all real-valued
functions whose domain is R. Show that S and R are not equinumerous. Hint. Assume
$S \sim R$ and let f be a one-to-one function such that f(R) = S. If $a \in R$, let $g_a = f(a)$ be
the real-valued function in S which corresponds to the real number a. Now define h by
the equation $h(x) = 1 + g_x(x)$ if $x \in R$, and show that $h \notin S$.

2.18 Let S be the collection of all sequences whose terms are the integers 0 and 1 . Show 
that S is uncountable. 

2.19 Show that the following sets are countable: 

a) the set of circles in the complex plane having rational radii and centers with 
rational coordinates, 

b) any collection of disjoint intervals of positive length. 

2.20 Let f be a real-valued function defined for every x in the interval 0 < x < 1.
Suppose there is a positive number M having the following property: for every choice of
a finite number of points $x_1, x_2, \ldots, x_n$ in the interval 0 < x < 1, the sum

$|f(x_1) + \cdots + f(x_n)| < M.$

Let S be the set of those x in 0 < x < 1 for which $f(x) \neq 0$. Prove that S is countable.

2.21 Find the fallacy in the following “proof” that the set of all intervals of positive
length is countable.

Let $\{x_1, x_2, \ldots\}$ denote the countable set of rational numbers and let I be any
interval of positive length. Then I contains infinitely many rational points $x_n$, but among
these there will be one with smallest index n. Define a function F by means of the equation
F(I) = n, if $x_n$ is the rational number with smallest index in the interval I. This function
establishes a one-to-one correspondence between the set of all intervals and a subset of the
positive integers. Hence the set of all intervals is countable.

2.22 Let S denote the collection of all subsets of a given set T. Let $f : S \to R$ be a
real-valued function defined on S. The function f is called additive if $f(A \cup B) = f(A) + f(B)$
whenever A and B are disjoint subsets of T. If f is additive, prove that for any two subsets
A and B we have

$f(A \cup B) = f(A) + f(B - A)$ and $f(A \cup B) = f(A) + f(B) - f(A \cap B)$.

2.23 Refer to Exercise 2.22. Assume f is additive and assume also that the following
relations hold for two particular subsets A and B of T:

$f(A \cup B) = f(A') + f(B') - f(A')f(B'),$

$f(A \cap B) = f(A)f(B), \qquad f(A) + f(B) \neq f(T),$

where A' = T - A, B' = T - B. Prove that these relations determine f(T), and
compute the value of f(T).


SUGGESTED REFERENCES FOR FURTHER STUDY 

2.1 Boas, R. P., A Primer of Real Functions. Carus Monograph No. 13. Wiley, New 
York, 1960. 

2.2 Fraenkel, A., Abstract Set Theory, 3rd ed. North-Holland, Amsterdam, 1965.

2.3 Gleason, A., Fundamentals of Abstract Analysis. Addison-Wesley, Reading, 1966.

2.4 Halmos, P. R., Naive Set Theory. Van Nostrand, New York, 1960.

2.5 Kamke, E., Theory of Sets. F. Bagemihl, translator. Dover, New York, 1950. 

2.6 Kaplansky, I., Set Theory and Metric Spaces. Allyn and Bacon, Boston, 1972. 

2.7 Rotman, B., and Kneebone, G. T., The Theory of Sets and Transfinite Numbers. 
Elsevier, New York, 1968. 



CHAPTER 3 


ELEMENTS OF 
POINT SET TOPOLOGY 


3.1 INTRODUCTION 

A large part of the previous chapter dealt with “abstract” sets, that is, sets of 
arbitrary objects. In this chapter we specialize our sets to be sets of real numbers, 
sets of complex numbers, and more generally, sets in higher-dimensional spaces. 

In this area of study it is convenient and helpful to use geometric terminology. 
Thus, we speak about sets of points on the real line, sets of points in the plane, or 
sets of points in some higher-dimensional space. Later in this book we will study 
functions defined on point sets, and it is desirable to become acquainted with 
certain fundamental types of point sets, such as open sets, closed sets, and compact 
sets, before beginning the study of functions. The study of these sets is called
point set topology.

3.2 EUCLIDEAN SPACE $\mathbf{R}^n$

A point in two-dimensional space is an ordered pair of real numbers $(x_1, x_2)$.
Similarly, a point in three-dimensional space is an ordered triple of real numbers
$(x_1, x_2, x_3)$. It is just as easy to consider an ordered $n$-tuple of real numbers
$(x_1, x_2, \dots, x_n)$ and to refer to this as a point in $n$-dimensional space.

Definition 3.1. Let $n > 0$ be an integer. An ordered set of $n$ real numbers
$(x_1, x_2, \dots, x_n)$ is called an $n$-dimensional point or a vector with $n$ components.
Points or vectors will usually be denoted by single bold-face letters; for example,

$$\mathbf{x} = (x_1, x_2, \dots, x_n) \quad\text{or}\quad \mathbf{y} = (y_1, y_2, \dots, y_n).$$

The number $x_k$ is called the $k$th coordinate of the point $\mathbf{x}$ or the $k$th component of
the vector $\mathbf{x}$. The set of all $n$-dimensional points is called $n$-dimensional Euclidean
space or simply $n$-space, and is denoted by $\mathbf{R}^n$.

The reader may wonder whether there is any advantage in discussing spaces of
dimension greater than three. Actually, the language of $n$-space makes many
complicated situations much easier to comprehend. The reader is probably familiar
enough with three-dimensional vector analysis to realize the advantage of writing
the equations of motion of a system having three degrees of freedom as a single
vector equation rather than as three scalar equations. There is a similar advantage
if the system has $n$ degrees of freedom.

Another advantage in studying $n$-space for a general $n$ is that we are able to
deal in one stroke with many properties common to 1-space, 2-space, 3-space,
etc., that is, properties independent of the dimensionality of the space.

Higher-dimensional spaces arise quite naturally in such fields as relativity, and 
statistical and quantum mechanics. In fact, even infinite-dimensional spaces are 
quite common in quantum mechanics. 

Algebraic operations on $n$-dimensional points are defined as follows:

Definition 3.2. Let $\mathbf{x} = (x_1, \dots, x_n)$ and $\mathbf{y} = (y_1, \dots, y_n)$ be in $\mathbf{R}^n$. We define:

a) Equality: $\mathbf{x} = \mathbf{y}$ if, and only if, $x_1 = y_1, \dots, x_n = y_n$.

b) Sum: $\mathbf{x} + \mathbf{y} = (x_1 + y_1, \dots, x_n + y_n)$.

c) Multiplication by real numbers (scalars): $a\mathbf{x} = (ax_1, \dots, ax_n)$ ($a$ real).

d) Difference: $\mathbf{x} - \mathbf{y} = \mathbf{x} + (-1)\mathbf{y}$.

e) Zero vector or origin: $\mathbf{0} = (0, \dots, 0)$.

f) Inner product or dot product: $\mathbf{x} \cdot \mathbf{y} = \sum_{k=1}^{n} x_k y_k$.

g) Norm or length: $\|\mathbf{x}\| = (\mathbf{x} \cdot \mathbf{x})^{1/2} = \left( \sum_{k=1}^{n} x_k^2 \right)^{1/2}$.

The norm $\|\mathbf{x} - \mathbf{y}\|$ is called the distance between $\mathbf{x}$ and $\mathbf{y}$.

NOTE. In the terminology of linear algebra, $\mathbf{R}^n$ is an example of a linear space.
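
The operations of Definition 3.2 translate almost word for word into code. The following Python sketch is an illustration only and is not part of the original text; it represents a point as a tuple of numbers, and the function names (`add`, `scale`, `sub`, `dot`, `norm`, `dist`) are our own choices.

```python
import math

def add(x, y):
    """Sum: add corresponding components."""
    return tuple(xk + yk for xk, yk in zip(x, y))

def scale(a, x):
    """Multiplication of the point x by the real number a."""
    return tuple(a * xk for xk in x)

def sub(x, y):
    """Difference, defined as x + (-1)y."""
    return add(x, scale(-1, y))

def dot(x, y):
    """Inner product: the sum of the products x_k * y_k."""
    return sum(xk * yk for xk, yk in zip(x, y))

def norm(x):
    """Norm or length: the square root of x . x."""
    return math.sqrt(dot(x, x))

def dist(x, y):
    """Distance between x and y: the norm of x - y."""
    return norm(sub(x, y))

print(norm((3.0, 4.0)))                # 5.0
print(dist((1.0, 1.0), (4.0, 5.0)))    # 5.0
```

Because every function works componentwise, the same code serves for any dimension $n$, which is the practical content of the remark about dealing with all dimensions in one stroke.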


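Theorem 3.3. Let $\mathbf{x}$ and $\mathbf{y}$ denote points in $\mathbf{R}^n$. Then we have:

a) $\|\mathbf{x}\| \ge 0$, and $\|\mathbf{x}\| = 0$ if, and only if, $\mathbf{x} = \mathbf{0}$.

b) $\|a\mathbf{x}\| = |a|\,\|\mathbf{x}\|$ for every real $a$.

c) $\|\mathbf{x} - \mathbf{y}\| = \|\mathbf{y} - \mathbf{x}\|$.

d) $|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\|\,\|\mathbf{y}\|$ (Cauchy-Schwarz inequality).

e) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ (triangle inequality).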

Proof. Statements (a), (b) and (c) are immediate from the definition, and the
Cauchy-Schwarz inequality was proved in Theorem 1.23. Statement (e) follows
from (d) because

$$\|\mathbf{x} + \mathbf{y}\|^2 = \sum_{k=1}^{n} (x_k + y_k)^2 = \sum_{k=1}^{n} (x_k^2 + 2x_k y_k + y_k^2)$$

$$= \|\mathbf{x}\|^2 + 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + 2\|\mathbf{x}\|\,\|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2.$$

NOTE. Sometimes the triangle inequality is written in the form

$$\|\mathbf{x} - \mathbf{z}\| \le \|\mathbf{x} - \mathbf{y}\| + \|\mathbf{y} - \mathbf{z}\|.$$

This follows from (e) by replacing $\mathbf{x}$ by $\mathbf{x} - \mathbf{y}$ and $\mathbf{y}$ by $\mathbf{y} - \mathbf{z}$. We also have

$$\bigl|\,\|\mathbf{x}\| - \|\mathbf{y}\|\,\bigr| \le \|\mathbf{x} - \mathbf{y}\|.$$
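
A numerical experiment cannot replace the proof, but it is a useful sanity check on the inequalities of Theorem 3.3. The sketch below is an illustration we have added, with the hypothetical name `check_inequalities`; it tests the Cauchy-Schwarz inequality, the triangle inequality, and the inequality for the difference of two norms on randomly chosen vectors. The small tolerance `tol` is an assumption that absorbs floating-point rounding.

```python
import math
import random

def dot(x, y):
    return sum(xk * yk for xk, yk in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def check_inequalities(trials=1000, seed=0, tol=1e-9):
    """Return True if no random trial violates the three inequalities."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 6)
        x = [rng.uniform(-10, 10) for _ in range(n)]
        y = [rng.uniform(-10, 10) for _ in range(n)]
        total = [xk + yk for xk, yk in zip(x, y)]
        diff = [xk - yk for xk, yk in zip(x, y)]
        # Cauchy-Schwarz: |x . y| <= ||x|| ||y||
        if abs(dot(x, y)) > norm(x) * norm(y) + tol:
            return False
        # Triangle inequality: ||x + y|| <= ||x|| + ||y||
        if norm(total) > norm(x) + norm(y) + tol:
            return False
        # Difference of norms: | ||x|| - ||y|| | <= ||x - y||
        if abs(norm(x) - norm(y)) > norm(diff) + tol:
            return False
    return True

print(check_inequalities())
```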


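Definition 3.4. The unit coordinate vector $\mathbf{u}_k$ in $\mathbf{R}^n$ is the vector whose $k$th
component is 1 and whose remaining components are zero. Thus,

$$\mathbf{u}_1 = (1, 0, \dots, 0), \quad \mathbf{u}_2 = (0, 1, 0, \dots, 0), \quad \dots, \quad \mathbf{u}_n = (0, 0, \dots, 0, 1).$$

If $\mathbf{x} = (x_1, \dots, x_n)$ then $\mathbf{x} = x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n$ and $x_1 = \mathbf{x} \cdot \mathbf{u}_1$, $x_2 = \mathbf{x} \cdot \mathbf{u}_2$, ...,
$x_n = \mathbf{x} \cdot \mathbf{u}_n$. The vectors $\mathbf{u}_1, \dots, \mathbf{u}_n$ are also called basis vectors.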


3.3 OPEN BALLS AND OPEN SETS IN $\mathbf{R}^n$

Let $\mathbf{a}$ be a given point in $\mathbf{R}^n$ and let $r$ be a given positive number. The set of all
points $\mathbf{x}$ in $\mathbf{R}^n$ such that

$$\|\mathbf{x} - \mathbf{a}\| < r$$

is called an open $n$-ball of radius $r$ and center $\mathbf{a}$. We denote this set by $B(\mathbf{a})$ or
by $B(\mathbf{a}; r)$.

The ball $B(\mathbf{a}; r)$ consists of all points whose distance from $\mathbf{a}$ is less than $r$.
In $\mathbf{R}^1$ this is simply an open interval with center at $\mathbf{a}$. In $\mathbf{R}^2$ it is a circular disk,
and in $\mathbf{R}^3$ it is a spherical solid with center at $\mathbf{a}$ and radius $r$.

3.5 Definition of an interior point. Let $S$ be a subset of $\mathbf{R}^n$, and assume that $\mathbf{a} \in S$.
Then $\mathbf{a}$ is called an interior point of $S$ if there is an open $n$-ball with center at $\mathbf{a}$, all of
whose points belong to $S$.

In other words, every interior point $\mathbf{a}$ of $S$ can be surrounded by an $n$-ball
$B(\mathbf{a}) \subseteq S$. The set of all interior points of $S$ is called the interior of $S$ and is
denoted by int $S$. Any set containing a ball with center $\mathbf{a}$ is sometimes called a
neighborhood of $\mathbf{a}$.

3.6 Definition of an open set. A set $S$ in $\mathbf{R}^n$ is called open if all its points are interior
points.

NOTE. A set $S$ is open if and only if $S$ = int $S$. (See Exercise 3.9.)

Examples. In $\mathbf{R}^1$ the simplest type of nonempty open set is an open interval. The union
of two or more open intervals is also open. A closed interval $[a, b]$ is not an open set
because the endpoints $a$ and $b$ are not interior points of the interval.

Examples of open sets in the plane are: the interior of a disk; the cartesian product of
two one-dimensional open intervals. The reader should be cautioned that an open interval
in $\mathbf{R}^1$ is no longer an open set when it is considered as a subset of the plane. In fact, no
subset of $\mathbf{R}^1$ (except the empty set) can be open in $\mathbf{R}^2$, because such a set cannot contain
a 2-ball.

In $\mathbf{R}^n$ the empty set is open (Why?) as is the whole space $\mathbf{R}^n$. Every open $n$-ball
is an open set in $\mathbf{R}^n$. The cartesian product

$$(a_1, b_1) \times \cdots \times (a_n, b_n)$$

of $n$ one-dimensional open intervals $(a_1, b_1), \dots, (a_n, b_n)$ is an open set in $\mathbf{R}^n$ called
an $n$-dimensional open interval. We denote it by $(\mathbf{a}, \mathbf{b})$, where $\mathbf{a} = (a_1, \dots, a_n)$ and
$\mathbf{b} = (b_1, \dots, b_n)$.

The next two theorems show how additional open sets in $\mathbf{R}^n$ can be constructed
from given open sets.

Theorem 3.7. The union of any collection of open sets is an open set.

Proof. Let $F$ be a collection of open sets and let $S$ denote their union, $S = \bigcup_{A \in F} A$.
Assume $\mathbf{x} \in S$. Then $\mathbf{x}$ must belong to at least one of the sets in $F$, say $\mathbf{x} \in A$.
Since $A$ is open, there exists an open $n$-ball $B(\mathbf{x}) \subseteq A$. But $A \subseteq S$, so $B(\mathbf{x}) \subseteq S$
and hence $\mathbf{x}$ is an interior point of $S$. Since every point of $S$ is an interior point,
$S$ is open.

Theorem 3.8. The intersection of a finite collection of open sets is open.

Proof. Let $S = \bigcap_{k=1}^{m} A_k$ where each $A_k$ is open. Assume $\mathbf{x} \in S$. (If $S$ is empty,
there is nothing to prove.) Then $\mathbf{x} \in A_k$ for every $k = 1, 2, \dots, m$, and hence
there is an open $n$-ball $B(\mathbf{x}; r_k) \subseteq A_k$. Let $r$ be the smallest of the positive numbers
$r_1, \dots, r_m$. Then $\mathbf{x} \in B(\mathbf{x}; r) \subseteq S$. That is, $\mathbf{x}$ is an interior point, so $S$ is
open.

Thus we see that from given open sets, new open sets can be formed by taking
arbitrary unions or finite intersections. Arbitrary intersections, on the other hand,
will not always lead to open sets. For example, the intersection of all open intervals
of the form $(-1/n, 1/n)$, where $n = 1, 2, 3, \dots$, is the set consisting of 0 alone.
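
To see concretely why only 0 survives in this intersection, note that any $x \ne 0$ is excluded as soon as $1/n \le |x|$. The short Python sketch below, added here as an illustration with the hypothetical name `excluding_index`, returns such an index $n$; exact rational arithmetic is used so that no rounding enters.

```python
import math
from fractions import Fraction

def excluding_index(x):
    """Return an n with x outside (-1/n, 1/n), or None when x == 0."""
    x = Fraction(x)
    if x == 0:
        return None          # 0 lies in every interval (-1/n, 1/n)
    # Any integer n >= 1/|x| satisfies 1/n <= |x|, so x is not in (-1/n, 1/n).
    return math.ceil(1 / abs(x))

print(excluding_index(Fraction(1, 1000)))   # 1000
print(excluding_index(0))                   # None
```

The point 0 belongs to the intersection, but no interval with center 0 does, so the intersection is not open.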

3.4 THE STRUCTURE OF OPEN SETS IN $\mathbf{R}^1$

In $\mathbf{R}^1$ the union of a countable collection of disjoint open intervals is an open set
and, remarkably enough, every nonempty open set in $\mathbf{R}^1$ can be obtained in this
way. This section is devoted to a proof of this statement.

First we introduce the concept of a component interval.

3.9 Definition of component interval. Let $S$ be an open subset of $\mathbf{R}^1$. An open
interval $I$ (which may be finite or infinite) is called a component interval of $S$ if
$I \subseteq S$ and if there is no open interval $J \ne I$ such that $I \subseteq J \subseteq S$.

In other words, a component interval of S is not a proper subset of any other 
open interval contained in S. 

Theorem 3.10. Every point of a nonempty open set $S$ belongs to one and only one
component interval of $S$.

Proof. Assume $x \in S$. Then $x$ is contained in some open interval $I$ with $I \subseteq S$.
There are many such intervals but the “largest” of these will be the desired
component interval. We leave it to the reader to verify that this largest interval is
$I_x = (a(x), b(x))$, where

$$a(x) = \inf \{a : (a, x) \subseteq S\}, \qquad b(x) = \sup \{b : (x, b) \subseteq S\}.$$

Here $a(x)$ might be $-\infty$ and $b(x)$ might be $+\infty$. Clearly, there is no open interval
$J \ne I_x$ such that $I_x \subseteq J \subseteq S$, so $I_x$ is a component interval of $S$ containing $x$. If $J_x$
is another component interval of $S$ containing $x$, then the union $I_x \cup J_x$ is an
open interval contained in $S$ and containing both $I_x$ and $J_x$. Hence, by the
definition of component interval, it follows that $I_x \cup J_x = I_x$ and $I_x \cup J_x = J_x$, so
$I_x = J_x$.

Theorem 3.11 (Representation theorem for open sets on the real line). Every
nonempty open set $S$ in $\mathbf{R}^1$ is the union of a countable collection of disjoint component
intervals of $S$.

Proof. If $x \in S$, let $I_x$ denote the component interval of $S$ containing $x$. The union
of all such intervals $I_x$ is clearly $S$. If two of them, $I_x$ and $I_y$, have a point in
common, then their union $I_x \cup I_y$ is an open interval contained in $S$ and containing
both $I_x$ and $I_y$. Hence $I_x \cup I_y = I_x$ and $I_x \cup I_y = I_y$, so $I_x = I_y$. Therefore the
intervals $I_x$ form a disjoint collection.

It remains to show that they form a countable collection. For this purpose,
let $\{x_1, x_2, x_3, \dots\}$ denote the countable set of rational numbers. In each
component interval $I_x$ there will be infinitely many $x_n$, but among these there will be
exactly one with smallest index $n$. We then define a function $F$ by means of the
equation $F(I_x) = n$, if $x_n$ is the rational number in $I_x$ with smallest index $n$. This
function $F$ is one-to-one since $F(I_x) = F(I_y) = n$ implies that $I_x$ and $I_y$ have $x_n$ in
common and this implies $I_x = I_y$. Therefore $F$ establishes a one-to-one
correspondence between the intervals $I_x$ and a subset of the positive integers. This
completes the proof.
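
When an open set is given as a finite union of open intervals, its component intervals can be computed explicitly by merging intervals that overlap. The following Python sketch is our own illustration of this special case, not a construction from the text; the name `component_intervals` is hypothetical, and each interval is assumed to be a pair `(a, b)` with `a < b`.

```python
def component_intervals(intervals):
    """Merge open intervals (a, b) into their disjoint component intervals."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the current component, so the union is one interval.
            merged[-1][1] = max(merged[-1][1], b)
        else:
            # a is at or beyond the right endpoint, which is not in the union,
            # so a new component starts here.
            merged.append([a, b])
    return [tuple(interval) for interval in merged]

print(component_intervals([(3, 4), (0, 1), (0.5, 2)]))   # [(0, 2), (3, 4)]
print(component_intervals([(0, 1), (1, 2)]))             # [(0, 1), (1, 2)]
```

The second call shows that $(0, 1)$ and $(1, 2)$ remain separate components: the point 1 belongs to neither interval, so their union is not an interval.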

NOTE. This representation of $S$ is unique. In fact, if $S$ is a union of disjoint open
intervals, then these intervals must be the component intervals of $S$. This is an
immediate consequence of Theorem 3.10.

If $S$ is an open interval, then the representation contains only one component
interval, namely $S$ itself. Therefore an open interval in $\mathbf{R}^1$ cannot be expressed as
the union of two nonempty disjoint open sets. This property is also described by
saying that an open interval is connected. The concept of connectedness for sets
in $\mathbf{R}^n$ will be discussed further in Section 4.16.

3.5 CLOSED SETS 

3.12 Definition of a closed set. A set $S$ in $\mathbf{R}^n$ is called closed if its complement
$\mathbf{R}^n - S$ is open.

Examples. A closed interval $[a, b]$ in $\mathbf{R}^1$ is a closed set. The cartesian product

$$[a_1, b_1] \times \cdots \times [a_n, b_n]$$

of $n$ one-dimensional closed intervals is a closed set in $\mathbf{R}^n$ called an $n$-dimensional closed
interval $[\mathbf{a}, \mathbf{b}]$.

The next theorem, a consequence of Theorems 3.7 and 3.8, shows how to
construct further closed sets from given ones.

Theorem 3.13. The union of a finite collection of closed sets is closed, and the
intersection of an arbitrary collection of closed sets is closed.

A further relation between open and closed sets is described by the following
theorem.

Theorem 3.14. If $A$ is open and $B$ is closed, then $A - B$ is open and $B - A$ is
closed.

Proof. We simply note that $A - B = A \cap (\mathbf{R}^n - B)$, the intersection of two
open sets, and that $B - A = B \cap (\mathbf{R}^n - A)$, the intersection of two closed sets.


3.6 ADHERENT POINTS. ACCUMULATION POINTS 

Closed sets can also be described in terms of adherent points and accumulation 
points. 

3.15 Definition of an adherent point. Let $S$ be a subset of $\mathbf{R}^n$, and $\mathbf{x}$ a point in $\mathbf{R}^n$,
$\mathbf{x}$ not necessarily in $S$. Then $\mathbf{x}$ is said to be adherent to $S$ if every $n$-ball $B(\mathbf{x})$ contains
at least one point of $S$.

Examples

1. If $\mathbf{x} \in S$, then $\mathbf{x}$ adheres to $S$ for the trivial reason that every $n$-ball $B(\mathbf{x})$ contains $\mathbf{x}$.

2. If $S$ is a subset of $\mathbf{R}$ which is bounded above, then sup $S$ is adherent to $S$.

Some points adhere to $S$ because every ball $B(\mathbf{x})$ contains points of $S$ distinct
from $\mathbf{x}$. These are called accumulation points.

3.16 Definition of an accumulation point. If $S \subseteq \mathbf{R}^n$ and $\mathbf{x} \in \mathbf{R}^n$, then $\mathbf{x}$ is called
an accumulation point of $S$ if every $n$-ball $B(\mathbf{x})$ contains at least one point of $S$
distinct from $\mathbf{x}$.

In other words, $\mathbf{x}$ is an accumulation point of $S$ if, and only if, $\mathbf{x}$ adheres to
$S - \{\mathbf{x}\}$. If $\mathbf{x} \in S$ but $\mathbf{x}$ is not an accumulation point of $S$, then $\mathbf{x}$ is called an
isolated point of $S$.

Examples 

1. The set of numbers of the form 1/n, n = 1, 2, 3, . . . , has 0 as an accumulation point. 

2. The set of rational numbers has every real number as an accumulation point. 

3. Every point of the closed interval [a, b] is an accumulation point of the set of num- 
bers in the open interval (a, b). 

Theorem 3.17. If $\mathbf{x}$ is an accumulation point of $S$, then every $n$-ball $B(\mathbf{x})$ contains
infinitely many points of $S$.

Proof. Assume the contrary; that is, suppose an $n$-ball $B(\mathbf{x})$ exists which contains
only a finite number of points of $S$ distinct from $\mathbf{x}$, say $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_m$. If $r$ denotes
the smallest of the positive numbers

$$\|\mathbf{x} - \mathbf{a}_1\|, \ \|\mathbf{x} - \mathbf{a}_2\|, \ \dots, \ \|\mathbf{x} - \mathbf{a}_m\|,$$

then $B(\mathbf{x}; r/2)$ will be an $n$-ball about $\mathbf{x}$ which contains no points of $S$ distinct
from $\mathbf{x}$. This is a contradiction.

This theorem implies, in particular, that a set cannot have an accumulation 
point unless it contains infinitely many points to begin with. The converse, how- 
ever, is not true in general. For example, the set of integers {1, 2, 3, ... } is an 
infinite set with no accumulation points. In a later section we will show that 
infinite sets contained in some n-ball always have an accumulation point. This is 
an important result known as the Bolzano-Weierstrass theorem. 

3.7 CLOSED SETS AND ADHERENT POINTS 

A closed set was defined to be the complement of an open set. The next theorem 
describes closed sets in another way. 

Theorem 3.18. A set $S$ in $\mathbf{R}^n$ is closed if and only if it contains all its adherent
points.

Proof. Assume $S$ is closed and let $\mathbf{x}$ be adherent to $S$. We wish to prove that $\mathbf{x} \in S$.
We assume $\mathbf{x} \notin S$ and obtain a contradiction. If $\mathbf{x} \notin S$ then $\mathbf{x} \in \mathbf{R}^n - S$ and, since
$\mathbf{R}^n - S$ is open, some $n$-ball $B(\mathbf{x})$ lies in $\mathbf{R}^n - S$. Thus $B(\mathbf{x})$ contains no points of
$S$, contradicting the fact that $\mathbf{x}$ adheres to $S$.

To prove the converse, we assume $S$ contains all its adherent points and show
that $S$ is closed. Assume $\mathbf{x} \in \mathbf{R}^n - S$. Then $\mathbf{x} \notin S$, so $\mathbf{x}$ does not adhere to $S$.
Hence some ball $B(\mathbf{x})$ does not intersect $S$, so $B(\mathbf{x}) \subseteq \mathbf{R}^n - S$. Therefore $\mathbf{R}^n - S$
is open, and hence $S$ is closed.

3.19 Definition of closure. The set of all adherent points of a set $S$ is called the
closure of $S$ and is denoted by $\bar{S}$.

For any set we have $S \subseteq \bar{S}$ since every point of $S$ adheres to $S$. Theorem 3.18
shows that the opposite inclusion $\bar{S} \subseteq S$ holds if and only if $S$ is closed. Therefore
we have:

Theorem 3.20. A set $S$ is closed if and only if $S = \bar{S}$.

3.21 Definition of derived set. The set of all accumulation points of a set $S$ is
called the derived set of $S$ and is denoted by $S'$.

Clearly, we have $\bar{S} = S \cup S'$ for any set $S$. Hence Theorem 3.20 implies that
$S$ is closed if and only if $S' \subseteq S$. In other words, we have:

Theorem 3.22. A set $S$ in $\mathbf{R}^n$ is closed if and only if it contains all its accumulation
points.

3.8 THE BOLZANO-WEIERSTRASS THEOREM 

3.23 Definition of a bounded set. A set $S$ in $\mathbf{R}^n$ is said to be bounded if it lies entirely
within an $n$-ball $B(\mathbf{a}; r)$ for some $r > 0$ and some $\mathbf{a}$ in $\mathbf{R}^n$.

Theorem 3.24 (Bolzano-Weierstrass). If a bounded set $S$ in $\mathbf{R}^n$ contains infinitely
many points, then there is at least one point in $\mathbf{R}^n$ which is an accumulation point of $S$.

Proof. To help fix the ideas we give the proof first for $\mathbf{R}^1$. Since $S$ is bounded,
it lies in some interval $[-a, a]$. At least one of the subintervals $[-a, 0]$ or $[0, a]$
contains an infinite subset of $S$. Call one such subinterval $[a_1, b_1]$. Bisect $[a_1, b_1]$
and obtain a subinterval $[a_2, b_2]$ containing an infinite subset of $S$, and continue
this process. In this way a countable collection of intervals is obtained, the $n$th
interval $[a_n, b_n]$ being of length $b_n - a_n = a/2^{n-1}$. Clearly, the sup of the left
endpoints $a_n$ and the inf of the right endpoints $b_n$ must be equal, say to $x$. [Why
are they equal?] The point $x$ will be an accumulation point of $S$ because, if $r$ is
any positive number, the interval $[a_n, b_n]$ will be contained in $B(x; r)$ as soon as $n$
is large enough so that $b_n - a_n < r/2$. Since $[a_n, b_n]$ contains infinitely many
points of $S$, the interval $B(x; r)$ contains a point of $S$
distinct from $x$ and hence $x$ is an accumulation point of $S$. This proves the theorem
for $\mathbf{R}^1$. (Observe that the accumulation point $x$ may or may not belong to $S$.)
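
One way to answer the bracketed question in the proof is sketched below. This short derivation is our own addition; it uses only the facts that the intervals $[a_n, b_n]$ are nested and that the $n$th one has length $a/2^{n-1}$.

```latex
% Nested intervals: every left endpoint lies below every right endpoint.
a_1 \le a_2 \le \cdots \le a_n \le b_n \le \cdots \le b_2 \le b_1
\quad\Longrightarrow\quad
\sup_k a_k \le \inf_k b_k .

% Both numbers lie in [a_n, b_n] for every n, so their gap is at most its length.
0 \le \inf_k b_k - \sup_k a_k \le b_n - a_n = \frac{a}{2^{\,n-1}}
\qquad (n = 1, 2, \dots).

% The right-hand side is smaller than any given positive number for large n, hence
\sup_k a_k = \inf_k b_k .
```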

Next we give a proof for $\mathbf{R}^n$, $n > 1$, by an extension of the ideas used in treating
$\mathbf{R}^1$. (The reader may find it helpful to visualize the proof in $\mathbf{R}^2$ by referring to
Fig. 3.1.)

Since $S$ is bounded, $S$ lies in some $n$-ball $B(\mathbf{0}; a)$, $a > 0$, and therefore within
the $n$-dimensional interval $J_1$ defined by the inequalities

$$-a \le x_k \le a \qquad (k = 1, 2, \dots, n).$$

Here $J_1$ denotes the cartesian product

$$J_1 = I_1^{(1)} \times I_2^{(1)} \times \cdots \times I_n^{(1)};$$

that is, the set of points $(x_1, \dots, x_n)$, where $x_k \in I_k^{(1)}$ and where each $I_k^{(1)}$ is a
one-dimensional interval $-a \le x_k \le a$. Each interval $I_k^{(1)}$ can be bisected to
form two subintervals $I_{k,1}^{(1)}$ and $I_{k,2}^{(1)}$, defined by the inequalities

$$I_{k,1}^{(1)} : -a \le x_k \le 0; \qquad I_{k,2}^{(1)} : 0 \le x_k \le a.$$

Figure 3.1

Next, we consider all possible cartesian products of the form

$$I_{1,k_1}^{(1)} \times I_{2,k_2}^{(1)} \times \cdots \times I_{n,k_n}^{(1)} \qquad \text{(a)}$$

where each $k_i = 1$ or 2. There are exactly $2^n$ such products and, of course, each
such product is an $n$-dimensional interval. The union of these $2^n$ intervals is the
original interval $J_1$, which contains $S$; and hence at least one of the $2^n$ intervals in
(a) must contain infinitely many points of $S$. One of these we denote by $J_2$, which
can then be expressed as

$$J_2 = I_1^{(2)} \times I_2^{(2)} \times \cdots \times I_n^{(2)},$$

where each $I_k^{(2)}$ is one of the subintervals of $I_k^{(1)}$ of length $a$. We now proceed
with $J_2$ as we did with $J_1$, bisecting each interval $I_k^{(2)}$ and arriving at an
$n$-dimensional interval $J_3$ containing an infinite subset of $S$. If we continue the process,
we obtain a countable collection of $n$-dimensional intervals $J_1, J_2, J_3, \dots$, where
the $m$th interval $J_m$ has the property that it contains an infinite subset of $S$ and
can be expressed in the form

$$J_m = I_1^{(m)} \times I_2^{(m)} \times \cdots \times I_n^{(m)}, \qquad \text{where } I_k^{(m)} \subseteq I_k^{(1)}.$$

Writing

$$I_k^{(m)} = [a_k^{(m)}, b_k^{(m)}],$$

we have

$$b_k^{(m)} - a_k^{(m)} = \frac{a}{2^{m-2}} \qquad (k = 1, 2, \dots, n).$$

For each fixed $k$, the sup of all left endpoints $a_k^{(m)}$, $(m = 1, 2, \dots)$, must therefore
be equal to the inf of all right endpoints $b_k^{(m)}$, $(m = 1, 2, \dots)$, and their common
value we denote by $t_k$. We now assert that the point $\mathbf{t} = (t_1, t_2, \dots, t_n)$ is an
accumulation point of $S$. To see this, take any $n$-ball $B(\mathbf{t}; r)$. The point $\mathbf{t}$, of
course, belongs to each of the intervals $J_1, J_2, \dots$ constructed above, and when
$m$ is such that $\sqrt{n}\,a/2^{m-2} < r$, this neighborhood will include $J_m$, because every
point of $J_m$ differs from $\mathbf{t}$ by at most $a/2^{m-2}$ in each coordinate. But since $J_m$
contains infinitely many points of $S$, so will $B(\mathbf{t}; r)$, which proves that $\mathbf{t}$ is indeed
an accumulation point of $S$.


3.9 THE CANTOR INTERSECTION THEOREM 

As an application of the Bolzano-Weierstrass theorem we prove the Cantor 
intersection theorem. 

Theorem 3.25. Let $\{Q_1, Q_2, \dots\}$ be a countable collection of nonempty sets in $\mathbf{R}^n$
such that:

i) $Q_{k+1} \subseteq Q_k$ $(k = 1, 2, 3, \dots)$.

ii) Each set $Q_k$ is closed and $Q_1$ is bounded.

Then the intersection $\bigcap_{k=1}^{\infty} Q_k$ is closed and nonempty.

Proof. Let $S = \bigcap_{k=1}^{\infty} Q_k$. Then $S$ is closed because of Theorem 3.13. To show
that $S$ is nonempty, we exhibit a point $\mathbf{x}$ in $S$. We can assume that each $Q_k$
contains infinitely many points; otherwise the proof is trivial. Now form a collection
of distinct points $A = \{\mathbf{x}_1, \mathbf{x}_2, \dots\}$, where $\mathbf{x}_k \in Q_k$. Since $A$ is an infinite set
contained in the bounded set $Q_1$, it has an accumulation point, say $\mathbf{x}$. We shall
show that $\mathbf{x} \in S$ by verifying that $\mathbf{x} \in Q_k$ for each $k$. It will suffice to show that $\mathbf{x}$
is an accumulation point of each $Q_k$, since they are all closed sets. But every
neighborhood of $\mathbf{x}$ contains infinitely many points of $A$, and since all except
(possibly) a finite number of the points of $A$ belong to $Q_k$, this neighborhood also
contains infinitely many points of $Q_k$. Therefore $\mathbf{x}$ is an accumulation point of
$Q_k$ and the theorem is proved.
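
Neither hypothesis in (ii) can be dropped. The two standard examples below, which we add as an illustration and which do not appear in the text above, are decreasing sequences of nonempty sets in $\mathbf{R}^1$ whose intersection is empty: every real number is less than some integer $k$, and every positive number exceeds some $1/k$.

```latex
% Closed and decreasing, but Q_1 is not bounded:
Q_k = [k, +\infty) \subseteq \mathbf{R}^1, \qquad \bigcap_{k=1}^{\infty} Q_k = \emptyset .

% Bounded and decreasing, but the sets Q_k are not closed:
Q_k = (0, 1/k) \subseteq \mathbf{R}^1, \qquad \bigcap_{k=1}^{\infty} Q_k = \emptyset .
```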

3.10 THE LINDELÖF COVERING THEOREM

In this section we introduce the concept of a covering of a set and prove the
Lindelöf covering theorem. The usefulness of this concept will become apparent
in some of the later work.

3.26 Definition of a covering. A collection $F$ of sets is said to be a covering of a
given set $S$ if $S \subseteq \bigcup_{A \in F} A$. The collection $F$ is also said to cover $S$. If $F$ is a
collection of open sets, then $F$ is called an open covering of $S$.

Examples

1. The collection of all intervals of the form $1/n < x < 2/n$, $(n = 2, 3, 4, \dots)$, is an
open covering of the interval $0 < x < 1$. This is an example of a countable covering.

2. The real line $\mathbf{R}^1$ is covered by the collection of all open intervals $(a, b)$. This covering
is not countable. However, it contains a countable covering of $\mathbf{R}^1$, namely, all
intervals of the form $(n, n + 2)$, where $n$ runs through the integers.

3. Let $S = \{(x, y) : x > 0, y > 0\}$. The collection $F$ of all circular disks with centers
at $(x, x)$ and with radius $x$, where $x > 0$, is a covering of $S$. This covering is not
countable. However, it contains a countable covering of $S$, namely, all those disks
in which $x$ is rational. (See Exercise 3.18.)

The Lindelöf covering theorem states that every open covering of a set $S$ in $\mathbf{R}^n$
contains a countable subcollection which also covers $S$. The proof makes use of
the following preliminary result:

Theorem 3.27. Let $G = \{A_1, A_2, \dots\}$ denote the countable collection of all
$n$-balls having rational radii and centers at points with rational coordinates. Assume
$\mathbf{x} \in \mathbf{R}^n$ and let $S$ be an open set in $\mathbf{R}^n$ which contains $\mathbf{x}$. Then at least one of the
$n$-balls in $G$ contains $\mathbf{x}$ and is contained in $S$. That is, we have

$$\mathbf{x} \in A_k \subseteq S \quad \text{for some } A_k \text{ in } G.$$

Proof. The collection $G$ is countable because of Theorem 2.27. If $\mathbf{x} \in \mathbf{R}^n$ and if $S$
is an open set containing $\mathbf{x}$, then there is an $n$-ball $B(\mathbf{x}; r) \subseteq S$. We shall find a
point $\mathbf{y}$ in $S$ with rational coordinates that is “near” $\mathbf{x}$ and, using this point as
center, will then find a neighborhood in $G$ which lies within $B(\mathbf{x}; r)$ and which
contains $\mathbf{x}$. Write

$$\mathbf{x} = (x_1, x_2, \dots, x_n),$$

and let $y_k$ be a rational number such that $|y_k - x_k| < r/(4n)$ for each
$k = 1, 2, \dots, n$. Then

$$\|\mathbf{y} - \mathbf{x}\| \le |y_1 - x_1| + \cdots + |y_n - x_n| < \frac{r}{4}.$$

Next, let $q$ be a rational number such that $r/4 < q < r/2$. Then $\mathbf{x} \in B(\mathbf{y}; q)$ and
$B(\mathbf{y}; q) \subseteq B(\mathbf{x}; r) \subseteq S$. But $B(\mathbf{y}; q) \in G$ and hence the theorem is proved.
(See Fig. 3.2 for the situation in $\mathbf{R}^2$.)



Figure 3.2 
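
The proof of Theorem 3.27 is constructive, and its two choices (a rational center within $r/(4n)$ of $\mathbf{x}$ in each coordinate, and a rational radius between $r/4$ and $r/2$) can be carried out mechanically. The Python sketch below is an illustration we have added; the name `rational_ball` and the particular way of picking denominators are our own assumptions, and the inputs are converted to exact fractions so that the inequalities hold without rounding.

```python
import math
from fractions import Fraction

def rational_ball(x, r):
    """Return (y, q): a rational center and radius with x in B(y; q) inside B(x; r)."""
    x = [Fraction(xk) for xk in x]      # floats are converted exactly
    r = Fraction(r)
    n = len(x)
    # Choose N with 1/N < r/(4n); rounding to a multiple of 1/N then moves
    # each coordinate by at most 1/(2N), which is less than r/(4n).
    N = math.floor(4 * n / r) + 1
    y = tuple(Fraction(round(xk * N), N) for xk in x)
    # Choose D with 1/D < r/8; the first multiple of 1/D beyond r/4
    # is then a rational number q with r/4 < q < r/2.
    D = math.floor(8 / r) + 1
    q = Fraction(math.floor(r * D / 4) + 1, D)
    return y, q

y, q = rational_ball((math.sqrt(2), math.pi), 0.3)
print(y, q)
```

Many other choices of $\mathbf{y}$ and $q$ would serve equally well; the proof only needs the two inequalities stated in the comments.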


Theorem 3.28 (Lindelöf covering theorem). Assume $A \subseteq \mathbf{R}^n$ and let $F$ be an open
covering of $A$. Then there is a countable subcollection of $F$ which also covers $A$.

Proof. Let $G = \{A_1, A_2, \dots\}$ denote the countable collection of all $n$-balls
having rational centers and rational radii. This set $G$ will be used to help us extract
a countable subcollection of $F$ which covers $A$.

Assume $\mathbf{x} \in A$. Then there is an open set $S$ in $F$ such that $\mathbf{x} \in S$. By Theorem
3.27 there is an $n$-ball $A_k$ in $G$ such that $\mathbf{x} \in A_k \subseteq S$. There are, of course, infinitely
many such $A_k$ corresponding to each $S$, but we choose only one of these, for
example, the one of smallest index, say $m = m(\mathbf{x})$. Then we have $\mathbf{x} \in A_{m(\mathbf{x})} \subseteq S$.
The set of all $n$-balls $A_{m(\mathbf{x})}$ obtained as $\mathbf{x}$ varies over all elements of $A$ is a countable
collection of open sets which covers $A$. To get a countable subcollection of $F$
which covers $A$, we simply correlate to each set $A_{m(\mathbf{x})}$ one of the sets $S$ of $F$ which
contained $A_{m(\mathbf{x})}$. This completes the proof.

3.11 THE HEINE-BOREL COVERING THEOREM 

The Lindelöf covering theorem states that from any open covering of an arbitrary
set $A$ in $\mathbf{R}^n$ we can extract a countable covering. The Heine-Borel theorem tells
us that if, in addition, we know that $A$ is closed and bounded, we can reduce the
covering to a finite covering. The proof makes use of the Cantor intersection
theorem.

Theorem 3.29 (Heine-Borel). Let $F$ be an open covering of a closed and bounded
set $A$ in $\mathbf{R}^n$. Then a finite subcollection of $F$ also covers $A$.

Proof. A countable subcollection of $F$, say $\{I_1, I_2, \dots\}$, covers $A$, by Theorem
3.28. Consider, for $m \ge 1$, the finite union

$$S_m = \bigcup_{k=1}^{m} I_k.$$

This is open, since it is the union of open sets. We shall show that for some value
of $m$ the union $S_m$ covers $A$.

For this purpose we consider the complement $\mathbf{R}^n - S_m$, which is closed.
Define a countable collection of sets $\{Q_1, Q_2, \dots\}$ as follows: $Q_1 = A$, and for
$m > 1$,

$$Q_m = A \cap (\mathbf{R}^n - S_m).$$

That is, $Q_m$ consists of those points of $A$ which lie outside of $S_m$. If we can show that
for some value of $m$ the set $Q_m$ is empty, then we will have shown that for this $m$
no point of $A$ lies outside $S_m$; in other words, we will have shown that some $S_m$
covers $A$.

Observe the following properties of the sets $Q_m$: Each set $Q_m$ is closed, since
it is the intersection of the closed set $A$ and the closed set $\mathbf{R}^n - S_m$. The sets $Q_m$
are decreasing, since the $S_m$ are increasing; that is, $Q_{m+1} \subseteq Q_m$. The sets $Q_m$,
being subsets of $A$, are all bounded. Therefore, if no set $Q_m$ is empty, we can apply
the Cantor intersection theorem to conclude that the intersection $\bigcap_{k=1}^{\infty} Q_k$ is
also not empty. This means that there is some point in $A$ which is in all the sets
$Q_m$, or, what is the same thing, outside all the sets $S_m$. But this is impossible, since
$A \subseteq \bigcup_{k=1}^{\infty} S_k$. Therefore some $Q_m$ must be empty, and this completes the proof.

3.12 COMPACTNESS IN $\mathbf{R}^n$

We have just seen that if a set $S$ in $\mathbf{R}^n$ is closed and bounded, then any open
covering of $S$ can be reduced to a finite covering. It is natural to inquire whether
there might be sets other than closed and bounded sets which also have this
property. Such sets will be called compact.

3.30 Definition of a compact set. A set $S$ in $\mathbf{R}^n$ is said to be compact if, and only if,
every open covering of $S$ contains a finite subcover, that is, a finite subcollection which
also covers $S$.
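
The open interval $0 < x < 1$ is not compact, and the covering by the intervals $1/n < x < 2/n$ $(n = 2, 3, 4, \dots)$ from Example 1 of Section 3.10 shows why: every point is covered, yet any finite subcollection misses points near 0. The Python sketch below is an illustration we have added, with hypothetical function names, using exact fractions.

```python
import math
from fractions import Fraction

def covering_index(x):
    """For 0 < x < 1, return an n >= 2 with 1/n < x < 2/n."""
    # The smallest integer greater than 1/x lies below 2/x because 1/x > 1.
    return math.floor(1 / x) + 1

def covered(x, indices):
    """True if x lies in (1/n, 2/n) for some n in indices."""
    return any(Fraction(1, n) < x < Fraction(2, n) for n in indices)

def uncovered_point(indices):
    """Return a point of (0, 1) missed by the finite subfamily given by indices."""
    N = max(indices)
    # Each interval (1/n, 2/n) with n <= N lies to the right of 1/N.
    return Fraction(1, 2 * N)

x = Fraction(1, 3)
print(covering_index(x))                  # 4, since 1/4 < 1/3 < 1/2
print(uncovered_point([2, 3, 10]))        # 1/20
print(covered(Fraction(1, 20), [2, 3, 10]))   # False
```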

The Heine-Borel theorem states that every closed and bounded set in $\mathbf{R}^n$ is
compact. Now we prove the converse result.

Theorem 3.31. Let $S$ be a subset of $\mathbf{R}^n$. Then the following three statements are
equivalent:

a) $S$ is compact.

b) $S$ is closed and bounded.

c) Every infinite subset of $S$ has an accumulation point in $S$.

Proof. As noted above, (b) implies (a). If we prove that (a) implies (b), that (b)
implies (c) and that (c) implies (b), this will establish the equivalence of all three
statements.

Assume (a) holds. We shall prove first that $S$ is bounded. Choose a point $\mathbf{p}$
in $S$. The collection of $n$-balls $B(\mathbf{p}; k)$, $k = 1, 2, \dots$, is an open covering of $S$.
By compactness a finite subcollection also covers $S$ and hence $S$ is bounded.

Next we prove that $S$ is closed. Suppose $S$ is not closed. Then there is an
accumulation point $\mathbf{y}$ of $S$ such that $\mathbf{y} \notin S$. If $\mathbf{x} \in S$, let $r_{\mathbf{x}} = \|\mathbf{x} - \mathbf{y}\|/2$. Each $r_{\mathbf{x}}$
is positive since $\mathbf{y} \notin S$ and the collection $\{B(\mathbf{x}; r_{\mathbf{x}}) : \mathbf{x} \in S\}$ is an open covering of
$S$. By compactness, a finite number of these neighborhoods cover $S$, say

$$S \subseteq \bigcup_{k=1}^{p} B(\mathbf{x}_k; r_k).$$

Let $r$ denote the smallest of the radii $r_1, r_2, \dots, r_p$. Then it is easy to prove that
the ball $B(\mathbf{y}; r)$ has no points in common with any of the balls $B(\mathbf{x}_k; r_k)$. In fact,
if $\mathbf{x} \in B(\mathbf{y}; r)$, then $\|\mathbf{x} - \mathbf{y}\| < r \le r_k$, and by the triangle inequality we have
$\|\mathbf{y} - \mathbf{x}_k\| \le \|\mathbf{y} - \mathbf{x}\| + \|\mathbf{x} - \mathbf{x}_k\|$, so

$$\|\mathbf{x} - \mathbf{x}_k\| \ge \|\mathbf{y} - \mathbf{x}_k\| - \|\mathbf{x} - \mathbf{y}\| = 2r_k - \|\mathbf{x} - \mathbf{y}\| > r_k.$$

Hence $\mathbf{x} \notin B(\mathbf{x}_k; r_k)$. Therefore $B(\mathbf{y}; r) \cap S$ is empty, contradicting the fact that
$\mathbf{y}$ is an accumulation point of $S$. This contradiction shows that $S$ is closed and hence
(a) implies (b).

Assume (b) holds. In this case the proof of (c) is immediate, because if $T$ is
an infinite subset of $S$ then $T$ is bounded (since $S$ is bounded), and hence by the
Bolzano-Weierstrass theorem $T$ has an accumulation point $\mathbf{x}$, say. Now $\mathbf{x}$ is also
an accumulation point of $S$ and hence $\mathbf{x} \in S$, since $S$ is closed. Therefore (b)
implies (c).

Assume (c) holds. We shall prove (b). If $S$ is unbounded, then for every
$m > 0$ there exists a point $\mathbf{x}_m$ in $S$ with $\|\mathbf{x}_m\| > m$. The collection $T = \{\mathbf{x}_1, \mathbf{x}_2, \dots\}$
is an infinite subset of $S$ and hence, by (c), $T$ has an accumulation point $\mathbf{y}$ in $S$.
But for $m > 1 + \|\mathbf{y}\|$ we have

$$\|\mathbf{x}_m - \mathbf{y}\| \ge \|\mathbf{x}_m\| - \|\mathbf{y}\| > m - \|\mathbf{y}\| > 1,$$

contradicting the fact that $\mathbf{y}$ is an accumulation point of $T$. This proves that $S$ is
bounded.

To complete the proof we must show that $S$ is closed. Let $\mathbf{x}$ be an accumulation
point of $S$. Since every neighborhood of $\mathbf{x}$ contains infinitely many points of $S$,
we can consider the neighborhoods $B(\mathbf{x}; 1/k)$, where $k = 1, 2, \dots$, and obtain a
countable set of distinct points, say $T = \{\mathbf{x}_1, \mathbf{x}_2, \dots\}$, contained in $S$, such that
$\mathbf{x}_k \in B(\mathbf{x}; 1/k)$. The point $\mathbf{x}$ is also an accumulation point of $T$. Since $T$ is an
infinite subset of $S$, part (c) of the theorem tells us that $T$ must have an
accumulation point in $S$. The theorem will then be proved if we show that $\mathbf{x}$ is the only
accumulation point of $T$.

To do this, suppose that $\mathbf{y} \ne \mathbf{x}$. Then by the triangle inequality we have

$$\|\mathbf{y} - \mathbf{x}\| \le \|\mathbf{y} - \mathbf{x}_k\| + \|\mathbf{x}_k - \mathbf{x}\| < \|\mathbf{y} - \mathbf{x}_k\| + 1/k, \quad \text{if } \mathbf{x}_k \in T.$$

If $k_0$ is taken so large that $1/k < \tfrac{1}{2}\|\mathbf{y} - \mathbf{x}\|$ whenever $k \ge k_0$, the last inequality
leads to $\tfrac{1}{2}\|\mathbf{y} - \mathbf{x}\| < \|\mathbf{y} - \mathbf{x}_k\|$. This shows that $\mathbf{x}_k \notin B(\mathbf{y}; r)$ when $k \ge k_0$, if
$r = \tfrac{1}{2}\|\mathbf{y} - \mathbf{x}\|$. Hence $\mathbf{y}$ cannot be an accumulation point of $T$. This completes
the proof that (c) implies (b).

3.13 METRIC SPACES 

The proofs of some of the theorems of this chapter depend only on a few properties 
of the distance between points and not on the fact that the points are in R". When 
these properties of distance are studied abstractly they lead to the concept of a 
metric space. 

3.32 Definition of a metric space. A metric space is a nonempty set M of objects
(called points) together with a function d from $M \times M$ to $\mathbf{R}$ (called the metric of
the space) satisfying the following four properties for all points x, y, z in M:

1. $d(x, x) = 0$.

2. $d(x, y) > 0$ if $x \neq y$.

3. $d(x, y) = d(y, x)$.

4. $d(x, y) \leq d(x, z) + d(z, y)$.

The nonnegative number d(x, y) is to be thought of as the distance from x to 
y. In these terms the intuitive meaning of properties 1 through 4 is clear. Property 
4 is called the triangle inequality. 

We sometimes denote a metric space by (M, d) to emphasize that both the set
M and the metric d play a role in the definition of a metric space.


Examples 


1. $M = \mathbf{R}^n$; $d(x, y) = \|x - y\|$. This is called the Euclidean metric. Whenever we refer
to Euclidean space $\mathbf{R}^n$, it will be understood that the metric is the Euclidean metric
unless another metric is specifically mentioned.

2. $M = \mathbf{C}$, the complex plane; $d(z_1, z_2) = |z_1 - z_2|$. As a metric space, $\mathbf{C}$ is indistinguishable
from Euclidean space $\mathbf{R}^2$ because it has the same points and the same metric.

3. M any nonempty set; $d(x, y) = 0$ if $x = y$, $d(x, y) = 1$ if $x \neq y$. This is called the
discrete metric, and (M, d) is called a discrete metric space.

4. If (M, d) is a metric space and if S is any nonempty subset of M, then (S, d) is also a
metric space with the same metric or, more precisely, with the restriction of d to
$S \times S$ as metric. This is sometimes called the relative metric induced by d on S, and
S is called a metric subspace of M. For example, the rational numbers $\mathbf{Q}$ with the
metric $d(x, y) = |x - y|$ form a metric subspace of $\mathbf{R}$.

5. $M = \mathbf{R}^2$; $d(x, y) = \sqrt{(x_1 - y_1)^2 + 4(x_2 - y_2)^2}$, where $x = (x_1, x_2)$ and
$y = (y_1, y_2)$. The metric space (M, d) is not a metric subspace of Euclidean space $\mathbf{R}^2$
because the metric is different.

6. $M = \{(x_1, x_2) : x_1^2 + x_2^2 = 1\}$, the unit circle in $\mathbf{R}^2$; $d(x, y)$ = the length of the
smaller arc joining the two points x and y on the unit circle.

7. $M = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 + x_3^2 = 1\}$, the unit sphere in $\mathbf{R}^3$; $d(x, y)$ = the length
of the smaller arc along the great circle joining the two points x and y.

8. $M = \mathbf{R}^n$; $d(x, y) = |x_1 - y_1| + \cdots + |x_n - y_n|$.

9. $M = \mathbf{R}^n$; $d(x, y) = \max\{|x_1 - y_1|, \ldots, |x_n - y_n|\}$.
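The metric axioms can be spot-checked numerically for several of the examples above. The following Python sketch is illustrative only and is not part of the original text: all function names (`euclidean`, `discrete`, `taxicab`, `max_metric`, `squared_distance`, `satisfies_metric_axioms`) are hypothetical, and a check on finitely many sample points can refute the axioms but can never prove them.

```python
import itertools
import math


def euclidean(x, y):
    """Example 1: the Euclidean metric ||x - y|| on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def discrete(x, y):
    """Example 3: the discrete metric on any nonempty set."""
    return 0.0 if x == y else 1.0


def taxicab(x, y):
    """Example 8: |x_1 - y_1| + ... + |x_n - y_n|."""
    return sum(abs(a - b) for a, b in zip(x, y))


def max_metric(x, y):
    """Example 9: max{|x_1 - y_1|, ..., |x_n - y_n|}."""
    return max(abs(a - b) for a, b in zip(x, y))


def squared_distance(x, y):
    """Not a metric: the squared Euclidean distance violates property 4."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check properties 1 through 4 on every triple drawn from a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, x) != 0:                        # property 1
            return False
        if x != y and not d(x, y) > 0:          # property 2
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # property 3
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:   # property 4 (triangle inequality)
            return False
    return True


# Assumed sample points in R^2, chosen only for illustration.
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, -1.5), (-2.0, 3.0)]
for d in (euclidean, discrete, taxicab, max_metric, squared_distance):
    print(d.__name__, satisfies_metric_axioms(d, sample))
```

The squared distance fails because the points (0, 0), (1, 0), (2, 0) give 4 on the left of the triangle inequality and 1 + 1 on the right.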


3.14 POINT SET TOPOLOGY IN METRIC SPACES 

The basic notions of point set topology can be extended to an arbitrary metric
space (M, d).

If $a \in M$, the ball $B(a; r)$ with center a and radius $r > 0$ is defined to be the
set of all x in M such that

$$d(x, a) < r.$$

Sometimes we denote this ball by $B_M(a; r)$ to emphasize the fact that its points
come from M. If S is a metric subspace of M, the ball $B_S(a; r)$ is the intersection
of S with the ball $B_M(a; r)$.

Examples. In Euclidean space $\mathbf{R}^1$ the ball $B(0; 1)$ is the open interval $(-1, 1)$. In the
metric subspace $S = [0, 1]$ the ball $B_S(0; 1)$ is the half-open interval $[0, 1)$.

NOTE. The geometric appearance of a ball in $\mathbf{R}^n$ need not be "spherical" if the
metric is not the Euclidean metric. (See Exercise 3.27.)
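The dependence of a ball's shape on the metric can be seen by testing a few points. The sketch below is a hypothetical illustration in Python (the helper names `in_ball` and `in_subspace_ball` are not from the text): it tests membership in the ball $B(0; 1)$ of $\mathbf{R}^2$ under three metrics, and in the subspace ball $B_S(0; 1)$ for $S = [0, 1]$.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def taxicab(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def max_metric(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def in_ball(d, center, r, x):
    """True when x lies in the ball B(center; r) for the metric d."""
    return d(x, center) < r


def in_subspace_ball(x, center, r, lo=0.0, hi=1.0):
    """Ball in the metric subspace S = [lo, hi] of R^1: B_S = B_M intersected with S."""
    return lo <= x <= hi and abs(x - center) < r


origin = (0.0, 0.0)
corner = (0.9, 0.9)    # close to a corner of the square max{|x_1|, |x_2|} < 1
on_axis = (0.9, 0.0)

print(in_ball(max_metric, origin, 1.0, corner))   # the max-metric ball is a square
print(in_ball(euclidean, origin, 1.0, corner))    # the Euclidean ball is a round disk
print(in_ball(taxicab, origin, 1.0, corner))      # the sum-metric ball is a tilted square
print(in_ball(taxicab, origin, 1.0, on_axis))     # a point common to all three balls
print(in_subspace_ball(0.0, 0.0, 1.0), in_subspace_ball(1.0, 0.0, 1.0))
```

The point (0.9, 0.9) belongs only to the max-metric ball, which is why the three "unit balls" cannot all be disks.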

If $S \subseteq M$, a point a in S is called an interior point of S if some ball $B_M(a; r)$
lies entirely in S. The interior, int S, is the set of interior points of S. A set S is
called open in M if all its points are interior points; it is called closed in M if $M - S$
is open in M.

Examples. 

1. Every ball $B_M(a; r)$ in a metric space M is open in M.

2. In a discrete metric space M every subset S is open. In fact, if $x \in S$, the ball $B(x; \tfrac{1}{2})$
consists entirely of points of S (since it contains only x), so S is open. Therefore every
subset of M is also closed!

3. In the metric subspace $S = [0, 1]$ of Euclidean space $\mathbf{R}^1$, every interval of the form
$[0, x)$ or $(x, 1]$, where $0 < x < 1$, is an open set in S. These sets are not open in $\mathbf{R}^1$.

Example 3 shows that if S is a metric subspace of M the open sets in S need 
not be open in M. The next theorem describes the relation between open sets in 
M and those in S.

Theorem 3.33. Let (S, d) be a metric subspace of (M, d), and let X be a subset of
S. Then X is open in S if and only if

$$X = A \cap S$$

for some set A which is open in M.

Proof. Assume A is open in M and let $X = A \cap S$. If $x \in X$, then $x \in A$ so
$B_M(x; r) \subseteq A$ for some $r > 0$. Hence $B_S(x; r) = B_M(x; r) \cap S \subseteq A \cap S = X$,
so X is open in S.

Conversely, assume X is open in S. We will show that $X = A \cap S$ for some
open set A in M. For every x in X there is a ball $B_S(x; r_x)$ contained in X. Now
$B_S(x; r_x) = B_M(x; r_x) \cap S$, so if we let

$$A = \bigcup_{x \in X} B_M(x; r_x),$$

then A is open in M and it is easy to verify that $A \cap S = X$.
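
The verification that $A \cap S = X$ at the end of this proof can be written out as follows. This is one possible way to fill in the step, using only the relation between balls in S and balls in M stated above.

```latex
\[
A \cap S
  = \Bigl(\bigcup_{x \in X} B_M(x; r_x)\Bigr) \cap S
  = \bigcup_{x \in X} \bigl(B_M(x; r_x) \cap S\bigr)
  = \bigcup_{x \in X} B_S(x; r_x).
\]
```

Each ball $B_S(x; r_x)$ is contained in X, so the union on the right is contained in X. Conversely, each x in X lies in its own ball $B_S(x; r_x)$, so X is contained in the union. Hence the union equals X.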

Theorem 3.34. Let (S, d) be a metric subspace of (M, d) and let Y be a subset of
S. Then Y is closed in S if, and only if, $Y = B \cap S$ for some set B which is closed
in M.

Proof. If $Y = B \cap S$, where B is closed in M, then $B = M - A$ where A is open
in M so $Y = S \cap B = S \cap (M - A) = S - A$; hence Y is closed in S.

Conversely, if Y is closed in S, let $X = S - Y$. Then X is open in S so
$X = A \cap S$, where A is open in M and

$$Y = S - X = S - (A \cap S) = S - A = S \cap (M - A) = S \cap B,$$

where $B = M - A$ is closed in M. This completes the proof.

If $S \subseteq M$, a point x in M is called an adherent point of S if every ball $B_M(x; r)$
contains at least one point of S. If x adheres to $S - \{x\}$ then x is called an
accumulation point of S. The closure $\bar{S}$ of S is the set of all adherent points of S,
and the derived set $S'$ is the set of all accumulation points of S. Thus, $\bar{S} = S \cup S'$.

The following theorems are valid in every metric space (M, d) and are proved
exactly as they were for Euclidean space $\mathbf{R}^n$. In the proofs, the Euclidean distance
$\|x - y\|$ need only be replaced by the metric $d(x, y)$.

Theorem 3.35. a) The union of any collection of open sets is open, and the inter- 
section of a finite collection of open sets is open. 

b) The union of a finite collection of closed sets is closed, and the intersection of any 
collection of closed sets is closed. 

Theorem 3.36. If A is open and B is closed, then $A - B$ is open and $B - A$ is
closed.

Theorem 3.37. For any subset S of M the following statements are equivalent: 

a) S is closed in M. 

b) S contains all its adherent points. 

c) S contains all its accumulation points. 

d) $S = \bar{S}$.

Example. Let $M = \mathbf{Q}$, the set of rational numbers, with the Euclidean metric of $\mathbf{R}^1$.
Let S consist of all rational numbers in the open interval $(a, b)$, where both a and b are
irrational. Then S is a closed subset of $\mathbf{Q}$.

Our proofs of the Bolzano-Weierstrass theorem, the Cantor intersection
theorem, and the covering theorems of Lindelöf and Heine-Borel used not only the
metric properties of Euclidean space $\mathbf{R}^n$ but also special properties of $\mathbf{R}^n$ not generally
valid in an arbitrary metric space (M, d). Further restrictions on M are
required to extend these theorems to metric spaces. One of these extensions is 
outlined in Exercise 3.34. 

The next section describes compactness in an arbitrary metric space. 


3.15 COMPACT SUBSETS OF A METRIC SPACE 

Let (M, d) be a metric space and let S be a subset of M. A collection F of open
subsets of M is said to be an open covering of S if $S \subseteq \bigcup_{A \in F} A$.

A subset S of M is called compact if every open covering of S contains a finite
subcover. S is called bounded if $S \subseteq B(a; r)$ for some $r > 0$ and some a in M.

Theorem 3.38. Let S be a compact subset of a metric space M. Then:

i) S is closed and bounded. 

ii) Every infinite subset of S has an accumulation point in S. 

Proof. To prove (i) we refer to the proof of Theorem 3.31 and use that part of the 
argument which showed that (a) implies (b). The only change is that the Euclidean 
distance $\|x - y\|$ is to be replaced throughout by the metric $d(x, y)$.

To prove (ii) we argue by contradiction. Let T be an infinite subset of S and
assume that no point of S is an accumulation point of T. Then for each point x in
S there is a ball $B(x)$ which contains no point of T (if $x \notin T$) or exactly one point
of T (x itself, if $x \in T$). As x runs through S, the union of these balls $B(x)$ is an
open covering of S. Since S is compact, a finite subcollection covers S and hence 
also covers T. But this is a contradiction because T is an infinite set and each ball 
contains at most one point of T. 

NOTE. In Euclidean space $\mathbf{R}^n$, each of properties (i) and (ii) is equivalent to compactness
(Theorem 3.31). In a general metric space, property (ii) is equivalent to
compactness (for a proof see Reference 3.4), but property (i) is not. Exercise 3.42 
gives an example of a metric space M in which certain closed and bounded subsets 
are not compact. 

Theorem 3.39. Let X be a closed subset of a compact metric space M. Then X is 
compact. 

Proof. Let F be an open covering of X, say $X \subseteq \bigcup_{A \in F} A$. We will show that a
finite number of the sets A cover X. Since X is closed its complement $M - X$ is
open, so $F \cup \{(M - X)\}$ is an open covering of M. But M is compact, so this
covering contains a finite subcover which we can assume includes $M - X$. Therefore

$$M \subseteq A_1 \cup \cdots \cup A_p \cup (M - X).$$

This subcover also covers X and, since $M - X$ contains no points of X, we can
delete the set $M - X$ from the subcover and still cover X. Thus $X \subseteq A_1 \cup \cdots \cup A_p$
so X is compact.


3.16 BOUNDARY OF A SET 

Definition 3.40. Let S be a subset of a metric space M. A point x in M is called a
boundary point of S if every ball $B_M(x; r)$ contains at least one point of S and at
least one point of $M - S$. The set of all boundary points of S is called the boundary
of S and is denoted by $\partial S$.

The reader can easily verify that

$$\partial S = \bar{S} \cap \overline{M - S}.$$

This formula shows that $\partial S$ is closed in M.

Example. In $\mathbf{R}^n$, the boundary of a ball $B(a; r)$ is the set of points x such that $\|x - a\| = r$.
In $\mathbf{R}^1$, the boundary of the set of rational numbers is all of $\mathbf{R}^1$.

Further properties of metric spaces are developed in the Exercises and also in 
Chapter 4. 




EXERCISES 

Open and closed sets in $\mathbf{R}^1$ and $\mathbf{R}^2$

3.1 Prove that an open interval in $\mathbf{R}^1$ is an open set and that a closed interval is a closed
set.

3.2 Determine all the accumulation points of the following sets in $\mathbf{R}^1$ and decide whether
the sets are open or closed (or neither).

a) All integers. 

b) The interval $(a, b]$.

c) All numbers of the form $1/n$, $(n = 1, 2, 3, \ldots)$.

d) All rational numbers.

e) All numbers of the form $2^{-n} + 5^{-m}$, $(m, n = 1, 2, \ldots)$.

f) All numbers of the form $(-1)^n + (1/m)$, $(m, n = 1, 2, \ldots)$.

g) All numbers of the form $(1/n) + (1/m)$, $(m, n = 1, 2, \ldots)$.

h) All numbers of the form $(-1)^n/[1 + (1/n)]$, $(n = 1, 2, \ldots)$.

3.3 The same as Exercise 3.2 for the following sets in $\mathbf{R}^2$:

a) All complex z such that $|z| > 1$.

b) All complex z such that $|z| \geq 1$.

c) All complex numbers of the form $(1/n) + (i/m)$, $(m, n = 1, 2, \ldots)$.

d) All points $(x, y)$ such that $x^2 - y^2 < 1$.

e) All points $(x, y)$ such that $x > 0$.

f) All points $(x, y)$ such that $x \geq 0$.

3.4 Prove that every nonempty open set S in $\mathbf{R}^1$ contains both rational and irrational
numbers.

3.5 Prove that the only sets in $\mathbf{R}^1$ which are both open and closed are the empty set and
$\mathbf{R}^1$ itself. Is a similar statement true for $\mathbf{R}^2$?

3.6 Prove that every closed set in $\mathbf{R}^1$ is the intersection of a countable collection of open
sets.

3.7 Prove that a nonempty, bounded closed set S in $\mathbf{R}^1$ is either a closed interval, or that
S can be obtained from a closed interval by removing a countable disjoint collection of
open intervals whose endpoints belong to S.

Open and closed sets in $\mathbf{R}^n$

3.8 Prove that open n-balls and n-dimensional open intervals are open sets in $\mathbf{R}^n$.

3.9 Prove that the interior of a set in $\mathbf{R}^n$ is open in $\mathbf{R}^n$.

3.10 If $S \subseteq \mathbf{R}^n$, prove that int S is the union of all open subsets of $\mathbf{R}^n$ which are contained
in S. This is described by saying that int S is the largest open subset of S.

3.11 If S and T are subsets of $\mathbf{R}^n$, prove that

$$(\operatorname{int} S) \cap (\operatorname{int} T) = \operatorname{int}(S \cap T) \quad \text{and} \quad (\operatorname{int} S) \cup (\operatorname{int} T) \subseteq \operatorname{int}(S \cup T).$$

3.12 Let $S'$ denote the derived set and $\bar{S}$ the closure of a set S in $\mathbf{R}^n$. Prove that:

a) $S'$ is closed in $\mathbf{R}^n$; that is, $(S')' \subseteq S'$.

b) If $S \subseteq T$, then $S' \subseteq T'$. c) $(S \cup T)' = S' \cup T'$.

d) $(\bar{S})' = S'$. e) $\bar{S}$ is closed in $\mathbf{R}^n$.

f) $\bar{S}$ is the intersection of all closed subsets of $\mathbf{R}^n$ containing S. That is, $\bar{S}$ is the
smallest closed set containing S.

3.13 Let S and T be subsets of $\mathbf{R}^n$. Prove that $\overline{S \cap T} \subseteq \bar{S} \cap \bar{T}$ and that $S \cap \bar{T} \subseteq \overline{S \cap T}$
if S is open.

NOTE. The statements in Exercises 3.9 through 3.13 are true in any metric space.

3.14 A set S in $\mathbf{R}^n$ is called convex if, for every pair of points x and y in S and every real
$\theta$ satisfying $0 < \theta < 1$, we have $\theta x + (1 - \theta)y \in S$. Interpret this statement geometrically
(in $\mathbf{R}^2$ and $\mathbf{R}^3$) and prove that:

a) Every n-ball in $\mathbf{R}^n$ is convex.

b) Every n-dimensional open interval is convex.

c) The interior of a convex set is convex. 

d) The closure of a convex set is convex. 

3.15 Let F be a collection of sets in $\mathbf{R}^n$, and let $S = \bigcup_{A \in F} A$ and $T = \bigcap_{A \in F} A$. For
each of the following statements, either give a proof or exhibit a counterexample.

a) If x is an accumulation point of T, then x is an accumulation point of each set 
A in F. 

b) If x is an accumulation point of S , then x is an accumulation point of at least one 
set A in F. 

3.16 Prove that the set S of rational numbers in the interval (0, 1) cannot be expressed
as the intersection of a countable collection of open sets. Hint. Write $S = \{x_1, x_2, \ldots\}$,
assume $S = \bigcap_{k=1}^{\infty} S_k$, where each $S_k$ is open, and construct a sequence $\{Q_n\}$ of closed
intervals such that $Q_{n+1} \subseteq Q_n \subseteq S_n$ and such that $x_n \notin Q_n$. Then use the Cantor intersection
theorem to obtain a contradiction.

Covering theorems in $\mathbf{R}^n$

3.17 If $S \subseteq \mathbf{R}^n$, prove that the collection of isolated points of S is countable.

3.18 Prove that the set of open disks in the xy-plane with center at $(x, x)$ and radius
$x > 0$, x rational, is a countable covering of the set $\{(x, y) : x > 0, y > 0\}$.

3.19 The collection F of open intervals of the form $(1/n, 2/n)$, where $n = 2, 3, \ldots$, is an
open covering of the open interval (0, 1). Prove (without using Theorem 3.31) that no
finite subcollection of F covers (0, 1).

3.20 Give an example of a set S which is closed but not bounded and exhibit a countable
open covering F such that no finite subset of F covers S.

3.21 Given a set S in $\mathbf{R}^n$ with the property that for every x in S there is an n-ball $B(x)$
such that $B(x) \cap S$ is countable. Prove that S is countable.

3.22 Prove that a collection of disjoint open sets in $\mathbf{R}^n$ is necessarily countable. Give an
example of a collection of disjoint closed sets which is not countable.

3.23 Assume that $S \subseteq \mathbf{R}^n$. A point x in $\mathbf{R}^n$ is said to be a condensation point of S if every
n-ball $B(x)$ has the property that $B(x) \cap S$ is not countable. Prove that if S is not countable,
then there exists a point x in S such that x is a condensation point of S.

3.24 Assume that $S \subseteq \mathbf{R}^n$ and assume that S is not countable. Let T denote the set of
condensation points of S. Prove that:

a) $S - T$ is countable, b) $S \cap T$ is not countable,

c) T is a closed set, d) T contains no isolated points.

Note that Exercise 3.23 is a special case of (b).

3.25 A set S in $\mathbf{R}^n$ is called perfect if $S = S'$, that is, if S is a closed set which contains no
isolated points. Prove that every uncountable closed set F in $\mathbf{R}^n$ can be expressed in the
form $F = A \cup B$, where A is perfect and B is countable (Cantor-Bendixson theorem).

Hint. Use Exercise 3.24.


Metric spaces 


3.26 In any metric space (M, d), prove that the empty set $\emptyset$ and the whole space M are
both open and closed.

3.27 Consider the following two metrics in $\mathbf{R}^n$:

$$d_1(x, y) = \max_{1 \leq i \leq n} |x_i - y_i|, \qquad d_2(x, y) = \sum_{i=1}^{n} |x_i - y_i|.$$

In each of the following metric spaces prove that the ball $B(a; r)$ has the geometric
appearance indicated:

a) In $(\mathbf{R}^2, d_1)$, a square with sides parallel to the coordinate axes.

b) In $(\mathbf{R}^2, d_2)$, a square with diagonals parallel to the axes.

c) A cube in $(\mathbf{R}^3, d_1)$.

d) An octahedron in $(\mathbf{R}^3, d_2)$.

3.28 Let $d_1$ and $d_2$ be the metrics of Exercise 3.27 and let $\|x - y\|$ denote the usual
Euclidean metric. Prove the following inequalities for all x and y in $\mathbf{R}^n$:

$$d_1(x, y) \leq \|x - y\| \leq d_2(x, y) \quad \text{and} \quad d_2(x, y) \leq \sqrt{n}\,\|x - y\| \leq n\,d_1(x, y).$$


3.29 If (M, d) is a metric space, define

$$d'(x, y) = \frac{d(x, y)}{1 + d(x, y)}.$$

Prove that $d'$ is also a metric for M. Note that $0 \leq d'(x, y) < 1$ for all x, y in M.

3.30 Prove that every finite subset of a metric space is closed.

3.31 In a metric space (M, d) the closed ball of radius $r > 0$ about a point a in M is the
set $\bar{B}(a; r) = \{x : d(x, a) \leq r\}$.

a) Prove that $\bar{B}(a; r)$ is a closed set.

b) Give an example of a metric space in which $\bar{B}(a; r)$ is not the closure of the open
ball $B(a; r)$.

3.32 In a metric space M, if subsets satisfy $A \subseteq S \subseteq \bar{A}$, where $\bar{A}$ is the closure of A, then
A is said to be dense in S. For example, the set $\mathbf{Q}$ of rational numbers is dense in $\mathbf{R}$.
If A is dense in S and if S is dense in T, prove that A is dense in T.

3.33 Refer to Exercise 3.32. A metric space M is said to be separable if there is a countable
subset A which is dense in M. For example, $\mathbf{R}$ is separable because the set $\mathbf{Q}$ of rational
numbers is a countable dense subset. Prove that every Euclidean space $\mathbf{R}^k$ is separable.

3.34 Refer to Exercise 3.33. Prove that the Lindelof covering theorem (Theorem 3.: 
is valid in any separable metric space. 

3.35 Refer to Exercise 3.32. If A is dense in S and if B is open in S, prove that $B \subseteq \overline{A \cap B}$.
Hint. Exercise 3.13.

3.36 Refer to Exercise 3.32. If each of A and B is dense in S and if B is open in S, prove
that $A \cap B$ is dense in S.

3.37 Given two metric spaces $(S_1, d_1)$ and $(S_2, d_2)$, a metric $\rho$ for the Cartesian product
$S_1 \times S_2$ can be constructed from $d_1$ and $d_2$ in many ways. For example, if $x = (x_1, x_2)$
and $y = (y_1, y_2)$ are in $S_1 \times S_2$, let $\rho(x, y) = d_1(x_1, y_1) + d_2(x_2, y_2)$. Prove that $\rho$ is
a metric for $S_1 \times S_2$ and construct further examples.

Compact subsets of a metric space 

Prove each of the following statements concerning an arbitrary metric space (M, d) and
subsets S, T of M.

3.38 Assume $S \subseteq T \subseteq M$. Then S is compact in (M, d) if, and only if, S is compact in
the metric subspace (T, d).

3.39 If S is closed and T is compact, then $S \cap T$ is compact.

3.40 The intersection of an arbitrary collection of compact subsets of M is compact.

3.41 The union of a finite number of compact subsets of M is compact.

3.42 Consider the metric space $\mathbf{Q}$ of rational numbers with the Euclidean metric of $\mathbf{R}^1$.
Let S consist of all rational numbers in the open interval $(a, b)$, where a and b are irrational.
Then S is a closed and bounded subset of $\mathbf{Q}$ which is not compact.

Miscellaneous properties of the interior and the boundary 

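If A and B denote arbitrary subsets of a metric space M, prove that:

3.43 $\operatorname{int} A = M - \overline{M - A}$.

3.44 $\operatorname{int}(M - A) = M - \bar{A}$.

3.45 $\operatorname{int}(\operatorname{int} A) = \operatorname{int} A$.

3.46 a) $\operatorname{int}\bigl(\bigcap_{i=1}^{n} A_i\bigr) = \bigcap_{i=1}^{n} (\operatorname{int} A_i)$, where each $A_i \subseteq M$.

b) $\operatorname{int}\bigl(\bigcap_{A \in F} A\bigr) \subseteq \bigcap_{A \in F} (\operatorname{int} A)$, if F is an infinite collection of subsets of M.

c) Give an example where equality does not hold in (b).

3.47 a) $\bigcup_{A \in F} (\operatorname{int} A) \subseteq \operatorname{int}\bigl(\bigcup_{A \in F} A\bigr)$.

b) Give an example of a finite collection F in which equality does not hold in (a).

3.48 a) $\operatorname{int}(\partial A) = \emptyset$ if A is open or if A is closed in M.

b) Give an example in which $\operatorname{int}(\partial A) = M$.

3.49 If $\operatorname{int} A = \operatorname{int} B = \emptyset$ and if A is closed in M, then $\operatorname{int}(A \cup B) = \emptyset$.

3.50 Give an example in which $\operatorname{int} A = \operatorname{int} B = \emptyset$ but $\operatorname{int}(A \cup B) = M$.

3.51 $\partial A = \bar{A} \cap \overline{M - A}$ and $\partial A = \partial(M - A)$.

3.52 If $\bar{A} \cap \bar{B} = \emptyset$, then $\partial(A \cup B) = \partial A \cup \partial B$.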

CHAPTER 4


LIMITS AND 
CONTINUITY 


4.1 INTRODUCTION 

The reader is already familiar with the limit concept as introduced in elementary 
calculus where, in fact, several kinds of limits are usually presented. For example, 
the limit of a sequence of real numbers $\{x_n\}$, denoted symbolically by writing

$$\lim_{n \to \infty} x_n = A,$$

means that for every number $\varepsilon > 0$ there is an integer N such that

$$|x_n - A| < \varepsilon \quad \text{whenever } n \geq N.$$

This limit process conveys the intuitive idea that $x_n$ can be made arbitrarily close
to A provided that n is sufficiently large. There is also the limit of a function,
indicated by notation such as

$$\lim_{x \to p} f(x) = A,$$

which means that for every $\varepsilon > 0$ there is another number $\delta > 0$ such that

$$|f(x) - A| < \varepsilon \quad \text{whenever } 0 < |x - p| < \delta.$$

This conveys the idea that $f(x)$ can be made arbitrarily close to A by taking x
sufficiently close to p.

Applications of calculus to geometrical and physical problems in 3-space
and to functions of several variables make it necessary to extend these concepts
to $\mathbf{R}^n$. It is just as easy to go one step further and introduce limits in the more
general setting of metric spaces. This achieves a simplification in the theory by 
stripping it of unnecessary restrictions and at the same time covers nearly all the 
important aspects needed in analysis. 

First we discuss limits of sequences of points in a metric space, then we discuss 
limits of functions and the concept of continuity. 

4.2 CONVERGENT SEQUENCES IN A METRIC SPACE 

Definition 4.1. A sequence $\{x_n\}$ of points in a metric space (S, d) is said to converge
if there is a point p in S with the following property:

For every $\varepsilon > 0$ there is an integer N such that

$$d(x_n, p) < \varepsilon \quad \text{whenever } n \geq N.$$

We also say that $\{x_n\}$ converges to p and we write $x_n \to p$ as $n \to \infty$, or simply
$x_n \to p$. If there is no such p in S, the sequence $\{x_n\}$ is said to diverge.

NOTE. The definition of convergence implies that

$$x_n \to p \quad \text{if and only if} \quad d(x_n, p) \to 0.$$

The convergence of the sequence $\{d(x_n, p)\}$ to 0 takes place in the Euclidean metric
space $\mathbf{R}^1$.

Examples 

1. In Euclidean space $\mathbf{R}^1$, a sequence $\{x_n\}$ is called increasing if $x_n \leq x_{n+1}$ for all n. If
an increasing sequence is bounded above (that is, if $x_n \leq M$ for some $M > 0$ and
all n), then $\{x_n\}$ converges to the supremum of its range, $\sup\{x_1, x_2, \ldots\}$. Similarly,
$\{x_n\}$ is called decreasing if $x_{n+1} \leq x_n$ for all n. Every decreasing sequence which is
bounded below converges to the infimum of its range. For example, $\{1/n\}$ converges
to 0.

2. If $\{a_n\}$ and $\{b_n\}$ are real sequences converging to 0, then $\{a_n + b_n\}$ also converges to 0.
If $0 \leq c_n \leq a_n$ for all n and if $\{a_n\}$ converges to 0, then $\{c_n\}$ also converges to 0.
These elementary properties of sequences in $\mathbf{R}^1$ can be used to simplify some of the
proofs concerning limits in a general metric space.

3. In the complex plane $\mathbf{C}$, let $z_n = 1 + n^{-2} + (2 - 1/n)i$. Then $\{z_n\}$ converges to
$1 + 2i$ because

$$d(z_n, 1 + 2i)^2 = |z_n - (1 + 2i)|^2 = \frac{1}{n^4} + \frac{1}{n^2} \to 0 \quad \text{as } n \to \infty,$$

so $d(z_n, 1 + 2i) \to 0$.
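
The computation in Example 3 can be checked numerically. The short Python sketch below is an illustration added under stated assumptions, not part of the original text: it takes $z_n = 1 + n^{-2} + (2 - 1/n)i$ and the function name `z` is hypothetical. It compares the squared distance to $1 + 2i$ with the closed form $n^{-4} + n^{-2}$.

```python
def z(n):
    """z_n = 1 + n^(-2) + (2 - 1/n) i, as a Python complex number."""
    return complex(1 + n ** -2, 2 - 1 / n)


limit = complex(1, 2)

for n in (1, 10, 100, 1000):
    squared = abs(z(n) - limit) ** 2   # d(z_n, 1 + 2i)^2
    predicted = n ** -4 + n ** -2      # closed form from the text
    print(n, squared, predicted)
```

The two printed columns agree up to rounding error, and both shrink toward 0 as n grows.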

Theorem 4.2. A sequence $\{x_n\}$ in a metric space (S, d) can converge to at most one
point in S.

Proof. Assume that $x_n \to p$ and $x_n \to q$. We will prove that $p = q$. By the
triangle inequality we have

$$0 \leq d(p, q) \leq d(p, x_n) + d(x_n, q).$$

Since $d(p, x_n) \to 0$ and $d(x_n, q) \to 0$ this implies that $d(p, q) = 0$, so $p = q$.

If a sequence $\{x_n\}$ converges, the unique point to which it converges is called
the limit of the sequence and is denoted by $\lim x_n$ or by $\lim_{n \to \infty} x_n$.

Example. In Euclidean space $\mathbf{R}^1$ we have $\lim_{n \to \infty} 1/n = 0$. The same sequence in the
metric subspace $T = (0, 1]$ does not converge because the only candidate for the limit is
0 and $0 \notin T$. This example shows that the convergence or divergence of a sequence depends
on the underlying space as well as on the metric.

Theorem 4.3. In a metric space (S, d), assume $x_n \to p$ and let $T = \{x_1, x_2, \ldots\}$
be the range of $\{x_n\}$. Then:

a) T is bounded.

b) p is an adherent point of T.

Proof. a) Let N be the integer corresponding to $\varepsilon = 1$ in the definition of convergence.
Then every $x_n$ with $n \geq N$ lies in the ball $B(p; 1)$, so every point in T
lies in the ball $B(p; r)$, where

$$r = 1 + \max\{d(p, x_1), \ldots, d(p, x_{N-1})\}.$$

Therefore T is bounded.

b) Since every ball $B(p; \varepsilon)$ contains a point of T, p is an adherent point of T.

NOTE. If T is infinite, every ball $B(p; \varepsilon)$ contains infinitely many points of T, so
p is an accumulation point of T.

The next theorem provides a converse to part (b). 

Theorem 4.4. Given a metric space (S, d) and a subset $T \subseteq S$. If a point p in S is
an adherent point of T, then there is a sequence $\{x_n\}$ of points in T which converges
to p.

Proof. For every integer $n \geq 1$ there is a point $x_n$ in T with $d(p, x_n) \leq 1/n$.
Hence $d(p, x_n) \to 0$, so $x_n \to p$.

Theorem 4.5. In a metric space (S, d) a sequence $\{x_n\}$ converges to p if, and only
if, every subsequence $\{x_{k(n)}\}$ converges to p.

Proof. Assume $x_n \to p$ and consider any subsequence $\{x_{k(n)}\}$. For every $\varepsilon > 0$
there is an N such that $n \geq N$ implies $d(x_n, p) < \varepsilon$. Since $\{x_{k(n)}\}$ is a subsequence,
there is an integer M such that $k(n) \geq N$ for $n \geq M$. Hence $n \geq M$ implies
$d(x_{k(n)}, p) < \varepsilon$, which proves that $x_{k(n)} \to p$. The converse statement holds trivially
since $\{x_n\}$ is itself a subsequence.


4.3 CAUCHY SEQUENCES

If a sequence $\{x_n\}$ converges to a limit p, its terms must ultimately become close to
p and hence close to each other. This property is stated more formally in the next
theorem.

Theorem 4.6. Assume that $\{x_n\}$ converges in a metric space (S, d). Then for every
$\varepsilon > 0$ there is an integer N such that

$$d(x_n, x_m) < \varepsilon \quad \text{whenever } n \geq N \text{ and } m \geq N.$$

Proof. Let $p = \lim x_n$. Given $\varepsilon > 0$, let N be such that $d(x_n, p) < \varepsilon/2$ whenever
$n \geq N$. Then $d(x_m, p) < \varepsilon/2$ if $m \geq N$. If both $n \geq N$ and $m \geq N$ the triangle
inequality gives us

$$d(x_n, x_m) \leq d(x_n, p) + d(p, x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

4.7 Definition of a Cauchy Sequence. A sequence $\{x_n\}$ in a metric space (S, d) is
called a Cauchy sequence if it satisfies the following condition (called the Cauchy
condition):

For every $\varepsilon > 0$ there is an integer N such that

$$d(x_n, x_m) < \varepsilon \quad \text{whenever } n \geq N \text{ and } m \geq N.$$

Theorem 4.6 states that every convergent sequence is a Cauchy sequence. The
converse is not true in a general metric space. For example, the sequence $\{1/n\}$ is
a Cauchy sequence in the Euclidean subspace $T = (0, 1]$ of $\mathbf{R}^1$, but this sequence
does not converge in T. However, the converse of Theorem 4.6 is true in every
Euclidean space $\mathbf{R}^k$.
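
The claim that $\{1/n\}$ satisfies the Cauchy condition can be verified directly. The estimate below is a worked step added for illustration, not taken from the original text; it uses only the fact that $1/n$ and $1/m$ are positive.

```latex
\[
  \left| \frac{1}{n} - \frac{1}{m} \right|
  \le \max\left\{ \frac{1}{n}, \frac{1}{m} \right\}
  \le \frac{1}{N} < \varepsilon
  \qquad \text{whenever } n \ge N,\ m \ge N,\ \text{and } N > 1/\varepsilon .
\]
```

The same estimate holds in T and in $\mathbf{R}^1$, since both carry the same metric; only the existence of a limit depends on the space.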

Theorem 4.8. In Euclidean space $\mathbf{R}^k$ every Cauchy sequence is convergent.

Proof. Let $\{x_n\}$ be a Cauchy sequence in $\mathbf{R}^k$ and let $T = \{x_1, x_2, \ldots\}$ be the range
of the sequence. If T is finite, then all except a finite number of the terms $\{x_n\}$ are
equal and hence $\{x_n\}$ converges to this common value.

Now suppose T is infinite. We use the Bolzano-Weierstrass theorem to show
that T has an accumulation point p, and then we show that $\{x_n\}$ converges to p.
First we need to know that T is bounded. This follows from the Cauchy condition.
In fact, when $\varepsilon = 1$ there is an N such that $n \geq N$ implies $\|x_n - x_N\| < 1$. This
means that all points $x_n$ with $n \geq N$ lie inside a ball of radius 1 about $x_N$ as center,
so T lies inside a ball of radius $1 + M$ about 0, where M is the largest of the
numbers $\|x_1\|, \ldots, \|x_N\|$. Therefore, since T is a bounded infinite set it has an
accumulation point p in $\mathbf{R}^k$ (by the Bolzano-Weierstrass theorem). We show next
that $\{x_n\}$ converges to p.

Given $\varepsilon > 0$ there is an N such that $\|x_n - x_m\| < \varepsilon/2$ whenever $n \geq N$ and
$m \geq N$. The ball $B(p; \varepsilon/2)$ contains a point $x_m$ with $m \geq N$. Hence if $n \geq N$ we
have

$$\|x_n - p\| \leq \|x_n - x_m\| + \|x_m - p\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

so $\lim x_n = p$. This completes the proof.


Examples 

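1. Theorem 4.8 is often used for proving the convergence of a sequence when the limit
is not known in advance. For example, consider the sequence in $\mathbf{R}^1$ defined by

$$x_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{(-1)^{n-1}}{n}.$$

If $m > n \geq N$, grouping successive terms in pairs gives

$$|x_m - x_n| = \left| \frac{1}{n+1} - \frac{1}{n+2} + \cdots \pm \frac{1}{m} \right| < \frac{1}{n} \leq \frac{1}{N},$$

so $|x_m - x_n| < \varepsilon$ as soon as $N > 1/\varepsilon$. Therefore $\{x_n\}$ is a Cauchy sequence and hence
it converges to some limit. It can be shown (see Exercise 8.18) that this limit is log 2,
a fact which is not immediately obvious.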
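
A numerical experiment makes the value log 2 plausible. The Python sketch below is illustrative and not part of the original text: it assumes the sequence in question is the alternating partial sum $x_n = 1 - \tfrac{1}{2} + \tfrac{1}{3} - \cdots + (-1)^{n-1}/n$, and the function name `x` is hypothetical. It checks the Cauchy estimate $|x_m - x_n| < 1/n$ for a few pairs and compares the partial sums with `math.log(2)`.

```python
import math


def x(n):
    """Partial sum 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


# Cauchy estimate: |x_m - x_n| < 1/n whenever m > n.
pairs = [(10, 11), (10, 1000), (100, 5000)]
for n, m in pairs:
    print(n, m, abs(x(m) - x(n)), 1 / n)

# The partial sums approach log 2 (slowly).
for n in (10, 100, 1000, 10000):
    print(n, x(n), abs(x(n) - math.log(2)))
```

A finite computation of this kind suggests the limit but does not prove it; the proof is the subject of the exercise cited in the text.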


2. Given a real sequence $\{a_n\}$ such that $|a_{n+2} - a_{n+1}| \leq \tfrac{1}{2}|a_{n+1} - a_n|$ for all $n \geq 1$.
We can prove that $\{a_n\}$ converges without knowing its limit. Let $b_n = |a_{n+1} - a_n|$.
Then $0 \leq b_{n+1} \leq b_n/2$ so, by induction, $b_{n+1} \leq b_1/2^n$. Hence $b_n \to 0$. Also, if $m > n$
we have

$$a_m - a_n = \sum_{k=n}^{m-1} (a_{k+1} - a_k);$$

hence

$$|a_m - a_n| \leq \sum_{k=n}^{m-1} b_k \leq b_n \left(1 + \frac{1}{2} + \cdots + \frac{1}{2^{m-1-n}}\right) < 2b_n.$$

This implies that $\{a_n\}$ is a Cauchy sequence, so $\{a_n\}$ converges.
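
A concrete sequence satisfying the hypothesis of Example 2 is easy to produce. The Python sketch below is a hypothetical illustration, not taken from the text: it assumes the averaging recurrence $a_{n+2} = (a_{n+1} + a_n)/2$ with $a_1 = 0$, $a_2 = 1$, for which $a_{n+2} - a_{n+1} = -\tfrac{1}{2}(a_{n+1} - a_n)$, and it uses exact rational arithmetic so that the inequalities are checked without rounding error.

```python
from fractions import Fraction

# Assumed example: a_1 = 0, a_2 = 1, a_{n+2} = (a_{n+1} + a_n) / 2.
a = [Fraction(0), Fraction(1)]
while len(a) < 30:
    a.append((a[-1] + a[-2]) / 2)

# b_n = |a_{n+1} - a_n|. Lists are 0-indexed, so b[0] holds b_1.
b = [abs(a[i + 1] - a[i]) for i in range(len(a) - 1)]

# Hypothesis of the example: b_{n+1} <= b_n / 2.
hypothesis_holds = all(b[i + 1] <= b[i] / 2 for i in range(len(b) - 1))

# Conclusion of the example: |a_m - a_n| < 2 b_n whenever m > n.
bound_holds = all(
    abs(a[m] - a[n]) < 2 * b[n]
    for n in range(len(b))
    for m in range(n + 1, len(a))
)

print(hypothesis_holds, bound_holds, float(a[-1]))
```

For this particular recurrence the terms settle near 2/3, although the argument in the example never needs to know that value.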


4.4 COMPLETE METRIC SPACES 

Definition 4.9. A metric space (S, d) is called complete if every Cauchy sequence
in S converges in S. A subset T of S is called complete if the metric subspace (T, d)
is complete.

Example 1. Every Euclidean space $\mathbf{R}^k$ is complete (Theorem 4.8). In particular, $\mathbf{R}^1$ is
complete, but the subspace $T = (0, 1]$ is not complete.

Example 2. The space $\mathbf{R}^n$ with the metric $d(x, y) = \max_{1 \leq i \leq n} |x_i - y_i|$ is complete.

The next theorem relates completeness with compactness.

Theorem 4.10. In any metric space (S, d) every compact subset T is complete.

Proof. Let $\{x_n\}$ be a Cauchy sequence in T and let $A = \{x_1, x_2, \ldots\}$ denote the
range of $\{x_n\}$. If A is finite, then $\{x_n\}$ converges to one of the elements of A, hence
$\{x_n\}$ converges in T.

If A is infinite, Theorem 3.38 tells us that A has an accumulation point p in
T since T is compact. We show next that $x_n \to p$. Given $\varepsilon > 0$, choose N so that
$n \geq N$ and $m \geq N$ implies $d(x_n, x_m) < \varepsilon/2$. The ball $B(p; \varepsilon/2)$ contains a point
$x_m$ with $m \geq N$. Therefore if $n \geq N$ the triangle inequality gives us

$$d(x_n, p) \leq d(x_n, x_m) + d(x_m, p) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

so $x_n \to p$. Therefore every Cauchy sequence in T has a limit in T, so T is complete.

4.5 LIMIT OF A FUNCTION 

In this section we consider two metric spaces $(S, d_S)$ and $(T, d_T)$, where $d_S$ and $d_T$
denote the respective metrics. Let A be a subset of S and let $f : A \to T$ be a
function from A to T.

Definition 4.11. If p is an accumulation point of A and if $b \in T$, the notation

$$\lim_{x \to p} f(x) = b, \tag{1}$$

is defined to mean the following:

For every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$d_T(f(x), b) < \varepsilon \quad \text{whenever } x \in A,\ x \neq p, \text{ and } d_S(x, p) < \delta.$$

The symbol in (1) is read "the limit of $f(x)$, as x tends to p, is b," or "$f(x)$
approaches b as x approaches p." We sometimes indicate this by writing $f(x) \to b$
as $x \to p$.

The definition conveys the intuitive idea that $f(x)$ can be made arbitrarily
close to b by taking x sufficiently close to p. (See Fig. 4.1.) We require that p be
an accumulation point of A to make certain that there will be points x in A
sufficiently close to p, with $x \neq p$. However, p need not be in the domain of f,
and b need not be in the range of f.

NOTE. The definition can also be formulated in terms of balls. Thus, (1) holds if,
and only if, for every ball $B_T(b)$, there is a ball $B_S(p)$ such that $B_S(p) \cap A$ is not
empty and such that

$$f(x) \in B_T(b) \quad \text{whenever } x \in B_S(p) \cap A,\ x \neq p.$$

When formulated this way, the definition is meaningful when p or b (or both) are 
in the extended real number system R* or in the extended complex number system 
C*. However, in what follows, it is to be understood that p and b are finite unless 
it is explicitly stated that they can be infinite. 

The next theorem relates limits of functions to limits of convergent sequences. 

Theorem 4.12. Assume $p$ is an accumulation point of $A$ and assume $b \in T$. Then

$$\lim_{x \to p} f(x) = b \tag{2}$$

if, and only if,

$$\lim_{n \to \infty} f(x_n) = b \tag{3}$$

for every sequence $\{x_n\}$ of points in $A - \{p\}$ which converges to $p$.

Proof. If (2) holds, then for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$d_T(f(x), b) < \varepsilon \quad \text{whenever } x \in A \text{ and } 0 < d_S(x, p) < \delta. \tag{4}$$

Now take any sequence $\{x_n\}$ in $A - \{p\}$ which converges to $p$. For the $\delta$ in (4),
there is an integer $N$ such that $n \ge N$ implies $d_S(x_n, p) < \delta$. Therefore (4) implies
$d_T(f(x_n), b) < \varepsilon$ for $n \ge N$, and hence $\{f(x_n)\}$ converges to $b$. Therefore (2)
implies (3).

To prove the converse we assume that (3) holds and that (2) is false and arrive
at a contradiction. If (2) is false, then for some $\varepsilon > 0$ and every $\delta > 0$ there is a
point $x$ in $A$ (where $x$ may depend on $\delta$) such that

$$0 < d_S(x, p) < \delta \quad \text{but} \quad d_T(f(x), b) \ge \varepsilon. \tag{5}$$

Taking $\delta = 1/n$, $n = 1, 2, \dots$, this means there is a corresponding sequence of
points $\{x_n\}$ in $A - \{p\}$ such that

$$0 < d_S(x_n, p) < 1/n \quad \text{but} \quad d_T(f(x_n), b) \ge \varepsilon.$$

Clearly, this sequence $\{x_n\}$ converges to $p$ but the sequence $\{f(x_n)\}$ does not
converge to $b$, contradicting (3).

NOTE. Theorems 4.12 and 4.2 together show that a function cannot have two
different limits as $x \to p$.
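
The sequential criterion of Theorem 4.12 can be illustrated numerically. The sketch below is only an illustration, not a proof: the functions $\sin(x)/x$ and $\sin(1/x)$, the point $p = 0$, and the particular sequences are choices made here and are not taken from the text. For the first function, two different sequences tending to 0 give images near 1. For the second, two sequences tending to 0 give images near 0 and near 1 respectively, so no limit can exist at 0.

```python
import math

def f(x):
    # sin(x)/x is defined on A = R - {0}; p = 0 is an accumulation point of A.
    return math.sin(x) / x

def g(x):
    # sin(1/x) is also defined on A = R - {0}.
    return math.sin(1.0 / x)

# Two sequences of points in A - {0} that both converge to p = 0.
xs = [1.0 / n for n in range(1, 2001)]
ys = [-1.0 / (n * n) for n in range(1, 2001)]

# Along both sequences the values of f approach b = 1.
f_tail_error = max(abs(f(x) - 1.0) for x in xs[-10:] + ys[-10:])

# u_n = 1/(n*pi) and v_n = 1/(2*n*pi + pi/2) both tend to 0, but
# g(u_n) stays near 0 while g(v_n) stays near 1.
us = [1.0 / (n * math.pi) for n in range(1, 201)]
vs = [1.0 / (2 * n * math.pi + math.pi / 2) for n in range(1, 201)]
g_on_us = max(abs(g(u)) for u in us)
g_on_vs = min(g(v) for v in vs)

print(f_tail_error, g_on_us, g_on_vs)
```

Since $\{g(u_n)\}$ and $\{g(v_n)\}$ have different limits, condition (3) fails for every candidate $b$, which is how the theorem is typically used to show that a limit does not exist.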

4.6 LIMITS OF COMPLEX-VALUED FUNCTIONS 

Let $(S, d)$ be a metric space, let $A$ be a subset of $S$, and consider two
complex-valued functions $f$ and $g$ defined on $A$,

$$f : A \to \mathbf{C}, \qquad g : A \to \mathbf{C}.$$

The sum $f + g$ is defined to be the function whose value at each point $x$ of $A$ is
the complex number $f(x) + g(x)$. The difference $f - g$, the product $f \cdot g$, and the
quotient $f/g$ are similarly defined. It is understood that the quotient is defined only
at those points $x$ for which $g(x) \ne 0$.

The usual rules for calculating with limits are given in the next theorem. 

Theorem 4.13. Let $f$ and $g$ be complex-valued functions defined on a subset $A$ of a
metric space $(S, d)$. Let $p$ be an accumulation point of $A$, and assume that

$$\lim_{x \to p} f(x) = a, \qquad \lim_{x \to p} g(x) = b.$$

Then we also have:

a) $\lim_{x \to p} [f(x) \pm g(x)] = a \pm b$,

b) $\lim_{x \to p} f(x)g(x) = ab$,

c) $\lim_{x \to p} f(x)/g(x) = a/b$ if $b \ne 0$.

Proof. We prove (b), leaving the other parts as exercises. Given $\varepsilon$ with $0 < \varepsilon < 1$,
let $\varepsilon'$ be a second number satisfying $0 < \varepsilon' < 1$, which will be made to depend on
$\varepsilon$ in a way to be described later. There is a $\delta > 0$ such that if $x \in A$, $x \ne p$, and
$d(x, p) < \delta$, then

$$|f(x) - a| < \varepsilon' \quad \text{and} \quad |g(x) - b| < \varepsilon'.$$

Then

$$|f(x)| = |a + (f(x) - a)| < |a| + \varepsilon' < |a| + 1.$$

Writing $f(x)g(x) - ab = f(x)g(x) - bf(x) + bf(x) - ab$, we have

$$|f(x)g(x) - ab| \le |f(x)|\,|g(x) - b| + |b|\,|f(x) - a| < (|a| + 1)\varepsilon' + |b|\varepsilon' = \varepsilon'(|a| + |b| + 1).$$

If we choose $\varepsilon' = \varepsilon/(|a| + |b| + 1)$, we see that $|f(x)g(x) - ab| < \varepsilon$ whenever
$x \in A$, $x \ne p$, and $d(x, p) < \delta$, and this proves (b).
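
The choice $\varepsilon' = \varepsilon/(|a| + |b| + 1)$ can be checked numerically. The following sketch is an illustration under assumptions made here: the values of $a$, $b$, $\varepsilon$, the helper names, and the random sampling are all hypothetical and stand in for the values $f(x)$ and $g(x)$ near $p$. It samples complex numbers $u$, $v$ with $|u - a| < \varepsilon'$ and $|v - b| < \varepsilon'$ and records the largest value of $|uv - ab|$.

```python
import cmath
import math
import random

def eps_prime(eps, a, b):
    # The choice made at the end of the proof: eps' = eps / (|a| + |b| + 1).
    return eps / (abs(a) + abs(b) + 1.0)

def worst_product_error(a, b, eps, trials=10000, seed=0):
    """Largest |u*v - a*b| seen over random u, v with |u - a|, |v - b| < eps'."""
    rng = random.Random(seed)
    e1 = eps_prime(eps, a, b)
    worst = 0.0
    for _ in range(trials):
        # Random points strictly inside the discs of radius eps' about a and b.
        u = a + cmath.rect(0.999 * e1 * rng.random(), rng.uniform(0.0, 2.0 * math.pi))
        v = b + cmath.rect(0.999 * e1 * rng.random(), rng.uniform(0.0, 2.0 * math.pi))
        worst = max(worst, abs(u * v - a * b))
    return worst

a, b, eps = 2 + 1j, -0.5 + 3j, 0.01
worst = worst_product_error(a, b, eps)
print(worst < eps)
```

The sampled error stays below $\varepsilon$, as the inequality in the proof predicts.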

4.7 LIMITS OF VECTOR-VALUED FUNCTIONS 

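Again, let $(S, d)$ be a metric space and let $A$ be a subset of $S$. Consider two
vector-valued functions $\mathbf{f}$ and $\mathbf{g}$ defined on $A$, each with values in $\mathbf{R}^k$,

$$\mathbf{f} : A \to \mathbf{R}^k, \qquad \mathbf{g} : A \to \mathbf{R}^k.$$

Quotients of vector-valued functions are not defined (if $k > 2$), but we can define
the sum $\mathbf{f} + \mathbf{g}$, the product $\lambda\mathbf{f}$ (if $\lambda$ is real) and the inner product $\mathbf{f} \cdot \mathbf{g}$ by the
respective formulas

$$(\mathbf{f} + \mathbf{g})(x) = \mathbf{f}(x) + \mathbf{g}(x), \qquad (\lambda\mathbf{f})(x) = \lambda\mathbf{f}(x), \qquad (\mathbf{f} \cdot \mathbf{g})(x) = \mathbf{f}(x) \cdot \mathbf{g}(x)$$

for each $x$ in $A$. We then have the following rules for calculating with limits of
vector-valued functions.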

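Theorem 4.14. Let $p$ be an accumulation point of $A$ and assume that

$$\lim_{x \to p} \mathbf{f}(x) = \mathbf{a}, \qquad \lim_{x \to p} \mathbf{g}(x) = \mathbf{b}.$$

Then we also have:

a) $\lim_{x \to p} [\mathbf{f}(x) + \mathbf{g}(x)] = \mathbf{a} + \mathbf{b}$,

b) $\lim_{x \to p} \lambda\mathbf{f}(x) = \lambda\mathbf{a}$ for every scalar $\lambda$,

c) $\lim_{x \to p} \mathbf{f}(x) \cdot \mathbf{g}(x) = \mathbf{a} \cdot \mathbf{b}$,

d) $\lim_{x \to p} \|\mathbf{f}(x)\| = \|\mathbf{a}\|$.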
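Proof. We prove only parts (c) and (d). To prove (c) we write

$$\mathbf{f}(x) \cdot \mathbf{g}(x) - \mathbf{a} \cdot \mathbf{b} = [\mathbf{f}(x) - \mathbf{a}] \cdot [\mathbf{g}(x) - \mathbf{b}] + \mathbf{a} \cdot [\mathbf{g}(x) - \mathbf{b}] + \mathbf{b} \cdot [\mathbf{f}(x) - \mathbf{a}].$$

The triangle inequality and the Cauchy-Schwarz inequality give us

$$0 \le |\mathbf{f}(x) \cdot \mathbf{g}(x) - \mathbf{a} \cdot \mathbf{b}| \le \|\mathbf{f}(x) - \mathbf{a}\|\,\|\mathbf{g}(x) - \mathbf{b}\| + \|\mathbf{a}\|\,\|\mathbf{g}(x) - \mathbf{b}\| + \|\mathbf{b}\|\,\|\mathbf{f}(x) - \mathbf{a}\|.$$

Each term on the right tends to 0 as $x \to p$, so $\mathbf{f}(x) \cdot \mathbf{g}(x) \to \mathbf{a} \cdot \mathbf{b}$. This proves
(c). To prove (d) note that $\bigl|\,\|\mathbf{f}(x)\| - \|\mathbf{a}\|\,\bigr| \le \|\mathbf{f}(x) - \mathbf{a}\|$.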

NOTE. Let $f_1, \dots, f_n$ be $n$ real-valued functions defined on $A$, and let $\mathbf{f} : A \to \mathbf{R}^n$
be the vector-valued function defined by the equation

$$\mathbf{f}(x) = (f_1(x), f_2(x), \dots, f_n(x)) \quad \text{if } x \in A.$$

Then $f_1, \dots, f_n$ are called the components of $\mathbf{f}$, and we also write $\mathbf{f} = (f_1, \dots, f_n)$
to denote this relationship.

If $\mathbf{a} = (a_1, \dots, a_n)$, then for each $r = 1, 2, \dots, n$ we have

$$|f_r(x) - a_r| \le \|\mathbf{f}(x) - \mathbf{a}\| \le \sum_{r=1}^{n} |f_r(x) - a_r|.$$

These inequalities show that $\lim_{x \to p} \mathbf{f}(x) = \mathbf{a}$ if, and only if, $\lim_{x \to p} f_r(x) = a_r$
for each $r$.
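
The two inequalities in the note above can be spot-checked on sample vectors. The sketch below is an illustration only: the helper name `component_bounds` and the random test vectors are assumptions made here. It compares the largest componentwise gap, the Euclidean norm of the difference, and the sum of the componentwise gaps.

```python
import math
import random

def component_bounds(fx, a):
    """Return (max_r |fx_r - a_r|, ||fx - a||, sum_r |fx_r - a_r|)."""
    diffs = [abs(s - t) for s, t in zip(fx, a)]
    return max(diffs), math.sqrt(sum(d * d for d in diffs)), sum(diffs)

rng = random.Random(1)
samples = []
for _ in range(1000):
    a = [rng.uniform(-5, 5) for _ in range(4)]
    fx = [rng.uniform(-5, 5) for _ in range(4)]
    samples.append(component_bounds(fx, a))

# Each triple should be nondecreasing, up to rounding error.
print(all(lo <= mid + 1e-12 and mid <= hi + 1e-12 for lo, mid, hi in samples))
```

The left inequality shows that convergence of $\mathbf{f}(x)$ forces convergence of every component, and the right inequality gives the converse.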

4.8 CONTINUOUS FUNCTIONS 

The definition of continuity presented in elementary calculus can be extended to 
functions from one metric space to another. 

Definition 4.15. Let $(S, d_S)$ and $(T, d_T)$ be metric spaces and let $f : S \to T$ be a
function from $S$ to $T$. The function $f$ is said to be continuous at a point $p$ in $S$ if
for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$d_T(f(x), f(p)) < \varepsilon \quad \text{whenever } d_S(x, p) < \delta.$$

If $f$ is continuous at every point of a subset $A$ of $S$, we say $f$ is continuous on $A$.

This definition reflects the intuitive idea that points close to $p$ are mapped by
$f$ into points close to $f(p)$. It can also be stated in terms of balls: A function $f$ is
continuous at $p$ if and only if, for every $\varepsilon > 0$, there is a $\delta > 0$ such that

$$f(B_S(p; \delta)) \subseteq B_T(f(p); \varepsilon).$$

Here $B_S(p; \delta)$ is a ball in $S$; its image under $f$ must be contained in the ball
$B_T(f(p); \varepsilon)$ in $T$. (See Fig. 4.2.)

If $p$ is an accumulation point of $S$, the definition of continuity implies that

$$\lim_{x \to p} f(x) = f(p).$$
Figure 4.2 (the image of $B_S(p; \delta)$)


If $p$ is an isolated point of $S$ (a point of $S$ which is not an accumulation point of
$S$), then every $f$ defined at $p$ will be continuous at $p$ because for sufficiently small $\delta$
there is only one $x$ satisfying $d_S(x, p) < \delta$, namely $x = p$, and $d_T(f(p), f(p)) = 0$.

Theorem 4.16. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$, and assume $p \in S$. Then $f$ is continuous at $p$ if and only if for every sequence
$\{x_n\}$ in $S$ convergent to $p$, the sequence $\{f(x_n)\}$ in $T$ converges to $f(p)$; in symbols,

$$\lim_{n \to \infty} f(x_n) = f\Bigl(\lim_{n \to \infty} x_n\Bigr).$$

The proof of this theorem is similar to that of Theorem 4.12 and is left as an
exercise for the reader. (The result can also be deduced from Theorem 4.12 but there is a
minor complication in the argument due to the fact that some terms of the sequence
$\{x_n\}$ could be equal to $p$.)

The theorem is often described by saying that for continuous functions the
limit symbol can be interchanged with the function symbol. Some care is needed
in interchanging these symbols because sometimes $\{f(x_n)\}$ converges when $\{x_n\}$
diverges.
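
A small example of this caution, chosen here for illustration and not taken from the text: with the continuous function $f(x) = x^2$ and the divergent sequence $x_n = (-1)^n$, the image sequence is constant and therefore convergent.

```python
def f(x):
    # f(x) = x**2 is continuous on R.
    return x * x

# x_n = (-1)**n oscillates between -1 and 1, so {x_n} diverges.
xs = [(-1) ** n for n in range(1, 21)]

# f(x_n) = 1 for every n, so {f(x_n)} converges to 1.
images = [f(x) for x in xs]

print(sorted(set(xs)), sorted(set(images)))
```

Here the left side of the interchange formula exists while the right side is not defined, so the formula applies only when $\{x_n\}$ itself converges.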

Example. If $x_n \to x$ and $y_n \to y$ in a metric space $(S, d)$, then $d(x_n, y_n) \to d(x, y)$
(Exercise 4.7). The reader can verify that $d$ is continuous on the metric space $(S \times S, \rho)$,
where $\rho$ is the metric of Exercise 3.37 with $S_1 = S_2 = S$.

NOTE. Continuity of a function $f$ at a point $p$ is called a local property of $f$ because
it depends on the behavior of $f$ only in the immediate vicinity of $p$. A property of
$f$ which concerns the whole domain of $f$ is called a global property. Thus, continuity
of $f$ on its domain is a global property.


4.9 CONTINUITY OF COMPOSITE FUNCTIONS 

Theorem 4.17. Let $(S, d_S)$, $(T, d_T)$, and $(U, d_U)$ be metric spaces. Let $f : S \to T$
and $g : f(S) \to U$ be functions, and let $h$ be the composite function defined on $S$ by
the equation

$$h(x) = g(f(x)) \quad \text{for } x \text{ in } S.$$

If $f$ is continuous at $p$ and if $g$ is continuous at $f(p)$, then $h$ is continuous at $p$.

Proof. Let $b = f(p)$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that

$$d_U(g(y), g(b)) < \varepsilon \quad \text{whenever } d_T(y, b) < \delta.$$

For this $\delta$ there is a $\delta'$ such that

$$d_T(f(x), f(p)) < \delta \quad \text{whenever } d_S(x, p) < \delta'.$$

Combining these two statements and taking $y = f(x)$, we find that

$$d_U(h(x), h(p)) < \varepsilon \quad \text{whenever } d_S(x, p) < \delta',$$

so $h$ is continuous at $p$.

4.10 CONTINUOUS COMPLEX-VALUED AND VECTOR-VALUED FUNCTIONS 

Theorem 4.18. Let $f$ and $g$ be complex-valued functions continuous at a point $p$ in
a metric space $(S, d)$. Then $f + g$, $f - g$, and $f \cdot g$ are each continuous at $p$. The
quotient $f/g$ is also continuous at $p$ if $g(p) \ne 0$.

Proof. The result is trivial if $p$ is an isolated point of $S$. If $p$ is an accumulation
point of $S$, we obtain the result from Theorem 4.13.

There is, of course, a corresponding theorem for vector-valued functions, which 
is proved in the same way, using Theorem 4.14. 

Theorem 4.19. Let $\mathbf{f}$ and $\mathbf{g}$ be functions continuous at a point $p$ in a metric space
$(S, d)$, and assume that $\mathbf{f}$ and $\mathbf{g}$ have values in $\mathbf{R}^n$. Then each of the following is
continuous at $p$: the sum $\mathbf{f} + \mathbf{g}$, the product $\lambda\mathbf{f}$ for every real $\lambda$, the inner product
$\mathbf{f} \cdot \mathbf{g}$, and the norm $\|\mathbf{f}\|$.

Theorem 4.20. Let $f_1, \dots, f_n$ be $n$ real-valued functions defined on a subset $A$ of a
metric space $(S, d_S)$, and let $\mathbf{f} = (f_1, \dots, f_n)$. Then $\mathbf{f}$ is continuous at a point $p$
of $A$ if and only if each of the functions $f_1, \dots, f_n$ is continuous at $p$.

Proof. If $p$ is an isolated point of $A$ there is nothing to prove. If $p$ is an accumulation
point, we note that $\mathbf{f}(x) \to \mathbf{f}(p)$ as $x \to p$ if and only if $f_k(x) \to f_k(p)$ for each
$k = 1, 2, \dots, n$.

4.11 EXAMPLES OF CONTINUOUS FUNCTIONS 

Let $S = \mathbf{C}$, the complex plane. It is a trivial exercise to show that the following
complex-valued functions are continuous on $\mathbf{C}$:

a) constant functions, defined by $f(z) = c$ for every $z$ in $\mathbf{C}$;

b) the identity function defined by $f(z) = z$ for every $z$ in $\mathbf{C}$.

Repeated application of Theorem 4.18 establishes the continuity of every polynomial:

$$f(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n,$$

the $a_i$ being complex numbers.

If $S$ is a subset of $\mathbf{C}$ on which the polynomial $f$ does not vanish, then $1/f$ is
continuous on $S$. Therefore a rational function $g/f$, where $g$ and $f$ are polynomials,
is continuous at those points of $\mathbf{C}$ at which the denominator does not vanish.

The familiar real-valued functions of elementary calculus, such as the
exponential, trigonometric, and logarithmic functions, are all continuous wherever
they are defined. The continuity of these elementary functions justifies the common
practice of evaluating certain limits by substituting the limiting value of the
“independent variable”; for example,

$$\lim_{x \to 0} e^x = e^0 = 1.$$

The continuity of the complex exponential and trigonometric functions is a
consequence of the continuity of the corresponding real-valued functions and
Theorem 4.20.


4.12 CONTINUITY AND INVERSE IMAGES OF OPEN OR CLOSED SETS 

The concept of inverse image can be used to give two important global descriptions 
of continuous functions. 


4.21 Definition of inverse image. Let $f : S \to T$ be a function from a set $S$ to a
set $T$. If $Y$ is a subset of $T$, the inverse image of $Y$ under $f$, denoted by $f^{-1}(Y)$, is
defined to be the largest subset of $S$ which $f$ maps into $Y$; that is,

$$f^{-1}(Y) = \{x : x \in S \text{ and } f(x) \in Y\}.$$

NOTE. If $f$ has an inverse function $f^{-1}$, the inverse image of $Y$ under $f$ is the same
as the image of $Y$ under $f^{-1}$, and in this case there is no ambiguity in the notation
$f^{-1}(Y)$. Note also that $f^{-1}(A) \subseteq f^{-1}(B)$ if $A \subseteq B \subseteq T$.


Theorem 4.22. Let $f : S \to T$ be a function from $S$ to $T$. If $X \subseteq S$ and $Y \subseteq T$,
then we have:

a) $X = f^{-1}(Y)$ implies $f(X) \subseteq Y$.

b) $Y = f(X)$ implies $X \subseteq f^{-1}(Y)$.


The proof of Theorem 4.22 is a direct translation of the definition of the
symbols $f^{-1}(Y)$ and $f(X)$, and is left to the reader. It should be observed that, in
general, we cannot conclude that $Y = f(X)$ implies $X = f^{-1}(Y)$. (See the example
in Fig. 4.3.)



Figure 4.3

Note that the statements in Theorem 4.22 can also be expressed as follows:

$$f[f^{-1}(Y)] \subseteq Y, \qquad X \subseteq f^{-1}[f(X)].$$

Note also that $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$ for all subsets $A$ and $B$ of $T$.
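
The relations between images and inverse images can be tried out on finite sets. The following sketch is an illustration under assumptions made here: the helper names `image` and `inverse_image`, the set $S = \{-3, \dots, 3\}$, and the squaring function are hypothetical choices, not taken from the text. It shows a case in which $Y = f(X)$ but $X$ is a proper subset of $f^{-1}(Y)$.

```python
def image(f, X):
    """The image f(X) of a finite set X."""
    return {f(x) for x in X}

def inverse_image(f, S, Y):
    """The inverse image of Y under f, for f defined on the finite set S."""
    return {x for x in S if f(x) in Y}

def square(x):
    return x * x

S = set(range(-3, 4))              # domain {-3, ..., 3}
X = {1, 2}
Y = image(square, X)               # {1, 4}
pre = inverse_image(square, S, Y)  # {-2, -1, 1, 2}

A, B = {0, 1}, {4, 5}
union_rule_holds = (
    inverse_image(square, S, A | B)
    == inverse_image(square, S, A) | inverse_image(square, S, B)
)

print(Y, sorted(pre), union_rule_holds)
```

Because squaring is not one-to-one on $S$, the inverse image of $f(X)$ picks up the points $-1$ and $-2$ in addition to $X$.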

Theorem 4.23. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$. Then $f$ is continuous on $S$ if, and only if, for every open set $Y$ in $T$, the
inverse image $f^{-1}(Y)$ is open in $S$.

Proof. Let $f$ be continuous on $S$, let $Y$ be open in $T$, and let $p$ be any point of
$f^{-1}(Y)$. We will prove that $p$ is an interior point of $f^{-1}(Y)$. Let $y = f(p)$. Since
$Y$ is open we have $B_T(y; \varepsilon) \subseteq Y$ for some $\varepsilon > 0$. Since $f$ is continuous at $p$, there
is a $\delta > 0$ such that $f(B_S(p; \delta)) \subseteq B_T(y; \varepsilon)$. Hence,

$$B_S(p; \delta) \subseteq f^{-1}[f(B_S(p; \delta))] \subseteq f^{-1}[B_T(y; \varepsilon)] \subseteq f^{-1}(Y),$$

so $p$ is an interior point of $f^{-1}(Y)$.

Conversely, assume that $f^{-1}(Y)$ is open in $S$ for every open subset $Y$ in $T$.
Choose $p$ in $S$ and let $y = f(p)$. We will prove that $f$ is continuous at $p$. For every
$\varepsilon > 0$, the ball $B_T(y; \varepsilon)$ is open in $T$, so $f^{-1}(B_T(y; \varepsilon))$ is open in $S$. Now,
$p \in f^{-1}(B_T(y; \varepsilon))$ so there is a $\delta > 0$ such that $B_S(p; \delta) \subseteq f^{-1}(B_T(y; \varepsilon))$.
Therefore, $f(B_S(p; \delta)) \subseteq B_T(y; \varepsilon)$ so $f$ is continuous at $p$.

Theorem 4.24. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$. Then $f$ is continuous on $S$ if, and only if, for every closed set $Y$ in $T$, the
inverse image $f^{-1}(Y)$ is closed in $S$.

Proof. If $Y$ is closed in $T$, then $T - Y$ is open in $T$ and

$$f^{-1}(T - Y) = S - f^{-1}(Y).$$

Now apply Theorem 4.23.

Examples. The image of an open set under a continuous mapping is not necessarily open.
A simple counterexample is a constant function which maps all of $S$ onto a single point
in $\mathbf{R}^1$. Similarly, the image of a closed set under a continuous mapping need not be closed.
For example, the real-valued function $f(x) = \arctan x$ maps $\mathbf{R}^1$ onto the open interval
$(-\pi/2, \pi/2)$.


4.13 FUNCTIONS CONTINUOUS ON COMPACT SETS 

The next theorem shows that the continuous image of a compact set is compact. 
This is another global property of continuous functions. 

Theorem 4.25. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$. If $f$ is continuous on a compact subset $X$ of $S$, then the image $f(X)$ is a
compact subset of $T$; in particular, $f(X)$ is closed and bounded in $T$.

Proof. Let $F$ be an open covering of $f(X)$, so that $f(X) \subseteq \bigcup_{A \in F} A$. We will show
that a finite number of the sets $A$ cover $f(X)$. Since $f$ is continuous on the metric
subspace $(X, d_S)$ we can apply Theorem 4.23 to conclude that each set $f^{-1}(A)$ is
open in $(X, d_S)$. The sets $f^{-1}(A)$ form an open covering of $X$ and, since $X$ is
compact, a finite number of them cover $X$, say $X \subseteq f^{-1}(A_1) \cup \cdots \cup f^{-1}(A_p)$.
Hence

$$f(X) \subseteq f[f^{-1}(A_1) \cup \cdots \cup f^{-1}(A_p)] = f[f^{-1}(A_1)] \cup \cdots \cup f[f^{-1}(A_p)] \subseteq A_1 \cup \cdots \cup A_p,$$

so $f(X)$ is compact. As a corollary of Theorem 3.38, we see that $f(X)$ is closed and
bounded.

Definition 4.26. A function $\mathbf{f} : S \to \mathbf{R}^k$ is called bounded on $S$ if there is a positive
number $M$ such that $\|\mathbf{f}(x)\| \le M$ for all $x$ in $S$.

Since $\mathbf{f}$ is bounded on $S$ if and only if $\mathbf{f}(S)$ is a bounded subset of $\mathbf{R}^k$, we have
the following corollary of Theorem 4.25.

Theorem 4.27. Let $\mathbf{f} : S \to \mathbf{R}^k$ be a function from a metric space $S$ to Euclidean
space $\mathbf{R}^k$. If $\mathbf{f}$ is continuous on a compact subset $X$ of $S$, then $\mathbf{f}$ is bounded on $X$.

This theorem has important implications for real-valued functions. If $f$ is
real-valued and bounded on $X$, then $f(X)$ is a bounded subset of $\mathbf{R}$, so it has a
supremum, $\sup f(X)$, and an infimum, $\inf f(X)$. Moreover,

$$\inf f(X) \le f(x) \le \sup f(X) \quad \text{for every } x \text{ in } X.$$

The next theorem shows that a continuous $f$ actually takes on the values $\sup f(X)$
and $\inf f(X)$ if $X$ is compact.

Theorem 4.28. Let $f : S \to \mathbf{R}$ be a real-valued function from a metric space $S$ to
Euclidean space $\mathbf{R}$. Assume that $f$ is continuous on a compact subset $X$ of $S$. Then
there exist points $p$ and $q$ in $X$ such that

$$f(p) = \inf f(X) \quad \text{and} \quad f(q) = \sup f(X).$$

NOTE. Since $f(p) \le f(x) \le f(q)$ for all $x$ in $X$, the numbers $f(p)$ and $f(q)$ are
called, respectively, the absolute or global minimum and maximum values of
$f$ on $X$.

Proof. Theorem 4.25 shows that $f(X)$ is a closed and bounded subset of $\mathbf{R}$. Let
$m = \inf f(X)$. Then $m$ is adherent to $f(X)$ and, since $f(X)$ is closed, $m \in f(X)$.
Therefore $m = f(p)$ for some $p$ in $X$. Similarly, $f(q) = \sup f(X)$ for some $q$ in $X$.

Theorem 4.29. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$. Assume that $f$ is one-to-one on $S$, so that the inverse function $f^{-1}$ exists.
If $S$ is compact and if $f$ is continuous on $S$, then $f^{-1}$ is continuous on $f(S)$.

Proof. By Theorem 4.24 (applied to $f^{-1}$) we need only show that for every closed
set $X$ in $S$ the image $f(X)$ is closed in $T$. (Note that $f(X)$ is the inverse image of
$X$ under $f^{-1}$.) Since $X$ is closed and $S$ is compact, $X$ is compact (by Theorem 3.39),
so $f(X)$ is compact (by Theorem 4.25) and hence $f(X)$ is closed (by Theorem 3.38).
This completes the proof.

Example. This example shows that compactness of $S$ is an essential part of Theorem
4.29. Let $S = [0, 1)$ with the usual metric of $\mathbf{R}^1$ and consider the complex-valued function
$f$ defined by

$$f(x) = e^{2\pi i x} \quad \text{for } 0 \le x < 1.$$

This is a one-to-one continuous mapping of the half-open interval $[0, 1)$ onto the unit
circle $|z| = 1$ in the complex plane. However, $f^{-1}$ is not continuous at the point $f(0)$.
For example, if $x_n = 1 - 1/n$, the sequence $\{f(x_n)\}$ converges to $f(0)$ but $\{x_n\}$ does not
converge in $S$.
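
The behavior described in this example can be observed numerically. The sketch below is an illustration only; the cutoff $n \le 2000$ and the variable names are choices made here. It evaluates $f(x_n)$ for $x_n = 1 - 1/n$ and compares the distance from $f(x_n)$ to $f(0)$ with the distance from $x_n$ to 0.

```python
import cmath
import math

def f(x):
    # f(x) = exp(2*pi*i*x) maps [0, 1) onto the unit circle.
    return cmath.exp(2j * math.pi * x)

# x_n = 1 - 1/n lies in S = [0, 1) and approaches 1, which is not in S.
xs = [1.0 - 1.0 / n for n in range(2, 2001)]

# Distances in the image (to f(0) = 1) and in the domain (to 0).
dist_images = [abs(f(x) - f(0.0)) for x in xs]
dist_points = [abs(x - 0.0) for x in xs]

print(dist_images[-1], dist_points[-1])
```

The images crowd toward $f(0)$ while the points $x_n$ stay far from 0, so $f^{-1}$ cannot be continuous at $f(0)$.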

4.14 TOPOLOGICAL MAPPINGS (HOMEOMORPHISMS) 

Definition 4.30. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to
another $(T, d_T)$. Assume also that $f$ is one-to-one on $S$, so that the inverse function
$f^{-1}$ exists. If $f$ is continuous on $S$ and if $f^{-1}$ is continuous on $f(S)$, then $f$ is called
a topological mapping or a homeomorphism, and the metric spaces $(S, d_S)$ and
$(f(S), d_T)$ are said to be homeomorphic.

If $f$ is a homeomorphism, then so is $f^{-1}$. Theorem 4.23 shows that a
homeomorphism maps open subsets of $S$ onto open subsets of $f(S)$. It also maps closed
subsets of $S$ onto closed subsets of $f(S)$.

A property of a set which remains invariant under every topological mapping 
is called a topological property. Thus the properties of being open, closed, or 
compact are topological properties. 

An important example of a homeomorphism is an isometry. This is a function
$f : S \to T$ which is one-to-one on $S$ and which preserves the metric; that is,

$$d_T(f(x), f(y)) = d_S(x, y)$$

for all points $x$ and $y$ in $S$. If there is an isometry from $(S, d_S)$ to $(f(S), d_T)$ the
two metric spaces are called isometric.

Topological mappings are particularly important in the theory of space curves. 
For example, a simple arc is the topological image of an interval, and a simple 
closed curve is the topological image of a circle. 


4.15 BOLZANO’S THEOREM 

This section is devoted to a famous theorem of Bolzano which concerns a global
property of real-valued functions continuous on compact intervals $[a, b]$ in $\mathbf{R}$.
If the graph of $f$ lies above the $x$-axis at $a$ and below the $x$-axis at $b$, Bolzano’s
theorem asserts that the graph must cross the axis somewhere in between. Our
proof will be based on a local property of continuous functions known as the
sign-preserving property.
Theorem 4.31. Let $f$ be defined on an interval $S$ in $\mathbf{R}$. Assume that $f$ is continuous
at a point $c$ in $S$ and that $f(c) \ne 0$. Then there is a 1-ball $B(c; \delta)$ such that $f(x)$
has the same sign as $f(c)$ in $B(c; \delta) \cap S$.

Proof. Assume $f(c) > 0$. For every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$f(c) - \varepsilon < f(x) < f(c) + \varepsilon \quad \text{whenever } x \in B(c; \delta) \cap S.$$

Take the $\delta$ corresponding to $\varepsilon = f(c)/2$ (this $\varepsilon$ is positive). Then we have

$$\tfrac{1}{2} f(c) < f(x) < \tfrac{3}{2} f(c) \quad \text{whenever } x \in B(c; \delta) \cap S,$$

so $f(x)$ has the same sign as $f(c)$ in $B(c; \delta) \cap S$. The proof is similar if $f(c) < 0$,
except that we take $\varepsilon = -\tfrac{1}{2} f(c)$.

Theorem 4.32 (Bolzano). Let $f$ be real-valued and continuous on a compact interval
$[a, b]$ in $\mathbf{R}$, and suppose that $f(a)$ and $f(b)$ have opposite signs; that is, assume
$f(a)f(b) < 0$. Then there is at least one point $c$ in the open interval $(a, b)$ such that
$f(c) = 0$.

Proof. For definiteness, assume $f(a) > 0$ and $f(b) < 0$. Let

$$A = \{x : x \in [a, b] \text{ and } f(x) \ge 0\}.$$

Then $A$ is nonempty since $a \in A$, and $A$ is bounded above by $b$. Let $c = \sup A$.
Then $a < c < b$. We will prove that $f(c) = 0$.

If $f(c) \ne 0$, there is a 1-ball $B(c; \delta)$ in which $f$ has the same sign as $f(c)$. If
$f(c) > 0$, there are points $x > c$ at which $f(x) > 0$, contradicting the definition
of $c$. If $f(c) < 0$, then $c - \delta/2$ is an upper bound for $A$, again contradicting the
definition of $c$. Therefore we must have $f(c) = 0$.
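
Bolzano's theorem guarantees that a zero exists but does not say how to find one. A common numerical companion is the bisection method, sketched below. This is a different technique from the supremum argument used in the proof above, and the function name `bisect_zero`, the tolerance, and the test function $2 - x^2$ are assumptions made here for illustration. At each step the method keeps a subinterval on which the endpoint values still have opposite signs, so the theorem applies to every subinterval.

```python
def bisect_zero(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of a continuous f on [a, b] when f(a)*f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2.0
        fc = f(c)
        # Stop on an exact zero or when the bracket is short enough.
        if fc == 0.0 or (b - a) / 2.0 < tol:
            return c
        if fa * fc < 0:
            # The sign change lies in [a, c].
            b, fb = c, fc
        else:
            # The sign change lies in [c, b].
            a, fa = c, fc
    return (a + b) / 2.0

# f(0) = 2 > 0 and f(2) = -2 < 0, so a zero lies in (0, 2).
root = bisect_zero(lambda x: 2.0 - x * x, 0.0, 2.0)
print(root)
```

The returned value approximates $\sqrt{2}$, the only zero of $2 - x^2$ in $(0, 2)$.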

From Bolzano’s theorem we can easily deduce the intermediate value theorem 
for continuous functions. 

Theorem 4.33. Assume $f$ is real-valued and continuous on a compact interval $S$ in
$\mathbf{R}$. Suppose there are two points $\alpha < \beta$ in $S$ such that $f(\alpha) \ne f(\beta)$. Then $f$ takes
every value between $f(\alpha)$ and $f(\beta)$ in the interval $(\alpha, \beta)$.

Proof. Let $k$ be a number between $f(\alpha)$ and $f(\beta)$ and apply Bolzano’s theorem to
the function $g$ defined on $[\alpha, \beta]$ by the equation $g(x) = f(x) - k$.

The intermediate value theorem, together with Theorem 4.28, implies that the
continuous image of a compact interval $S$ under a real-valued function is another
compact interval, namely,

$$[\inf f(S), \sup f(S)].$$

(If $f$ is constant on $S$, this will be a degenerate interval.) The next section extends
this property to the more general setting of metric spaces.
4.16 CONNECTEDNESS

This section describes the concept of connectedness and its relation to continuity. 

Definition 4.34. A metric space $S$ is called disconnected if $S = A \cup B$, where $A$
and $B$ are disjoint nonempty open sets in $S$. We call $S$ connected if it is not
disconnected.

NOTE. A subset $X$ of a metric space $S$ is called connected if, when regarded as a
metric subspace of $S$, it is a connected metric space.

Examples 

1. The metric space $S = \mathbf{R} - \{0\}$ with the usual Euclidean metric is disconnected, since
it is the union of two disjoint nonempty open sets, the positive real numbers and the
negative real numbers.

2. Every open interval in $\mathbf{R}$ is connected. This was proved in Section 3.4 as a
consequence of Theorem 3.11.

3. The set $\mathbf{Q}$ of rational numbers, regarded as a metric subspace of Euclidean space $\mathbf{R}^1$,
is disconnected. In fact, $\mathbf{Q} = A \cup B$, where $A$ consists of all rational numbers
$< \sqrt{2}$ and $B$ of all rational numbers $> \sqrt{2}$. Similarly, every ball in $\mathbf{Q}$ is disconnected.

4. Every metric space $S$ contains nonempty connected subsets. In fact, for each $p$ in $S$
the set $\{p\}$ is connected.

To relate connectedness with continuity we introduce the concept of a two-valued
function.

Definition 4.35. A real-valued function $f$ which is continuous on a metric space $S$ is
said to be two-valued on $S$ if $f(S) \subseteq \{0, 1\}$.

In other words, a two-valued function is a continuous function whose only
possible values are 0 and 1. This can be regarded as a continuous function from $S$
to the metric space $T = \{0, 1\}$, where $T$ has the discrete metric. We recall that
every subset of a discrete metric space $T$ is both open and closed in $T$.

Theorem 4.36. A metric space $S$ is connected if, and only if, every two-valued
function on $S$ is constant.

Proof. Assume $S$ is connected and let $f$ be a two-valued function on $S$. We must
show that $f$ is constant. Let $A = f^{-1}(\{0\})$ and $B = f^{-1}(\{1\})$ be the inverse
images of the subsets $\{0\}$ and $\{1\}$. Since $\{0\}$ and $\{1\}$ are open subsets of the
discrete metric space $\{0, 1\}$, both $A$ and $B$ are open in $S$. Hence, $S = A \cup B$,
where $A$ and $B$ are disjoint open sets. But since $S$ is connected, either $A$ is empty
and $B = S$, or else $B$ is empty and $A = S$. In either case, $f$ is constant on $S$.

Conversely, assume that $S$ is disconnected, so that $S = A \cup B$, where $A$ and
$B$ are disjoint nonempty open subsets of $S$. We will exhibit a two-valued function
on $S$ which is not constant. Let

$$f(x) = \begin{cases} 0 & \text{if } x \in A, \\ 1 & \text{if } x \in B. \end{cases}$$

Since $A$ and $B$ are nonempty, $f$ takes both values 0 and 1, so $f$ is not constant.
Also, $f$ is continuous on $S$ because the inverse image of every open subset of $\{0, 1\}$
is open in $S$.

Next we show that the continuous image of a connected set is connected. 

Theorem 4.37. Let $f : S \to M$ be a function from a metric space $S$ to another
metric space $M$. Let $X$ be a connected subset of $S$. If $f$ is continuous on $X$, then
$f(X)$ is a connected subset of $M$.

Proof. Let $g$ be a two-valued function on $f(X)$. We will show that $g$ is constant.
Consider the composite function $h$ defined on $X$ by the equation $h(x) = g(f(x))$.
Then $h$ is continuous on $X$ and can only take the values 0 and 1, so $h$ is two-valued
on $X$. Since $X$ is connected, $h$ is constant on $X$ and this implies that $g$ is constant
on $f(X)$. Therefore $f(X)$ is connected.

Example. Since an interval $X$ in $\mathbf{R}^1$ is connected, every continuous image $f(X)$ is
connected. If $f$ has real values, the image $f(X)$ is another interval. If $\mathbf{f}$ has values in $\mathbf{R}^n$, the
image $\mathbf{f}(X)$ is called a curve in $\mathbf{R}^n$. Thus, every curve in $\mathbf{R}^n$ is connected.

As a corollary of Theorem 4.37 we have the following extension of Bolzano’s 
theorem. 

Theorem 4.38 (Intermediate-value theorem for real continuous functions). Let $f$ be
real-valued and continuous on a connected subset $S$ of $\mathbf{R}^n$. If $f$ takes on two different
values in $S$, say $a$ and $b$, then for each real $c$ between $a$ and $b$ there exists a point $\mathbf{x}$
in $S$ such that $f(\mathbf{x}) = c$.

Proof. The image $f(S)$ is a connected subset of $\mathbf{R}^1$. Hence, $f(S)$ is an interval
containing $a$ and $b$ (see Exercise 4.38). If some value $c$ between $a$ and $b$ were not
in $f(S)$, then $f(S)$ would be disconnected.

4.17 COMPONENTS OF A METRIC SPACE 

This section shows that every metric space S can be expressed in a unique way as 
a union of connected “pieces” called components. First we prove the following: 

Theorem 4.39. Let $F$ be a collection of connected subsets of a metric space $S$ such
that the intersection $T = \bigcap_{A \in F} A$ is not empty. Then the union $U = \bigcup_{A \in F} A$ is
connected.

Proof. Since $T \ne \emptyset$, there is some $t$ in $T$. Let $f$ be a two-valued function on $U$.
We will show that $f$ is constant on $U$ by showing that $f(x) = f(t)$ for all $x$ in $U$.
If $x \in U$, then $x \in A$ for some $A$ in $F$. Since $A$ is connected, $f$ is constant on $A$
and, since $t \in A$, $f(x) = f(t)$.

Every point $x$ in a metric space $S$ belongs to at least one connected subset of
$S$, namely $\{x\}$. By Theorem 4.39, the union of all the connected subsets which
contain $x$ is also connected. We call this union a component of $S$, and we denote it
by $U(x)$. Thus, $U(x)$ is the maximal connected subset of $S$ which contains $x$.
Theorem 4.40. Every point of a metric space $S$ belongs to a uniquely determined
component of $S$. In other words, the components of $S$ form a collection of disjoint
sets whose union is $S$.

Proof. Two distinct components cannot contain a point $x$; otherwise (by Theorem
4.39) their union would be a larger connected set containing $x$.
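
For a simple class of sets the components can be computed explicitly. The sketch below is an illustration under assumptions made here: the set is a finite union of closed bounded intervals in $\mathbf{R}$, each given as a pair `(lo, hi)` with `lo <= hi`, and the function name `components` is hypothetical. Two closed intervals that share a point have a connected union by Theorem 4.39, so they are merged; the merged intervals are separated by gaps, and since every connected subset of $\mathbf{R}$ is an interval, each merged interval is a component.

```python
def components(intervals):
    """Merge closed intervals (lo, hi) that intersect; return the components."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            # This interval meets the current one, so their union is connected.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            # A gap separates this interval from everything seen so far.
            merged.append([lo, hi])
    return [tuple(m) for m in merged]

pieces = components([(3, 4), (0, 1), (7, 8), (1, 2), (3.5, 5)])
print(pieces)
```

The intervals $[0, 1]$ and $[1, 2]$ share the point 1 and merge into $[0, 2]$, while $[7, 8]$ meets no other interval and is a component by itself.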

4.18 ARCWISE CONNECTEDNESS 

This section describes a special property, called arcwise connectedness, which is
possessed by some (but not all) connected sets in Euclidean space $\mathbf{R}^n$.

Definition 4.41. A set $S$ in $\mathbf{R}^n$ is called arcwise connected if for any two points $\mathbf{a}$
and $\mathbf{b}$ in $S$ there is a continuous function $\mathbf{f} : [0, 1] \to S$ such that

$$\mathbf{f}(0) = \mathbf{a} \quad \text{and} \quad \mathbf{f}(1) = \mathbf{b}.$$

NOTE. Such a function is called a path from $\mathbf{a}$ to $\mathbf{b}$. If $\mathbf{f}(0) \ne \mathbf{f}(1)$, the image of
$[0, 1]$ under $\mathbf{f}$ is called an arc joining $\mathbf{a}$ and $\mathbf{b}$. Thus, $S$ is arcwise connected if
every pair of distinct points in $S$ can be joined by an arc lying in $S$. Arcwise
connected sets are also called pathwise connected. If $\mathbf{f}(t) = t\mathbf{b} + (1 - t)\mathbf{a}$ for
$0 \le t \le 1$, the curve joining $\mathbf{a}$ and $\mathbf{b}$ is called a line segment.
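
The line-segment path $\mathbf{f}(t) = t\mathbf{b} + (1 - t)\mathbf{a}$ is easy to write down explicitly. The sketch below is an illustration only: the helper names `segment` and `in_ball`, the two sample points, and the choice of the open unit ball in $\mathbf{R}^3$ are assumptions made here. It uses `math.dist`, which requires Python 3.8 or later, and checks sampled points of the segment against the ball.

```python
import math

def segment(a, b):
    """Return the path f(t) = t*b + (1 - t)*a for 0 <= t <= 1."""
    def f(t):
        return tuple(t * bi + (1.0 - t) * ai for ai, bi in zip(a, b))
    return f

def in_ball(x, center, radius):
    # Membership in the open ball B(center; radius).
    return math.dist(x, center) < radius

a = (0.5, 0.0, 0.3)
b = (-0.2, 0.6, -0.1)
path = segment(a, b)

origin = (0.0, 0.0, 0.0)
stays_inside = all(in_ball(path(k / 100.0), origin, 1.0) for k in range(101))

print(path(0.0), path(1.0), stays_inside)
```

The path starts at $\mathbf{a}$, ends at $\mathbf{b}$, and every sampled point lies in the ball, which is what one expects because a ball is convex.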

Examples 

1. Every convex set in $\mathbf{R}^n$ is arcwise connected, since the line segment joining two points
of such a set lies in the set. In particular, every $n$-ball is arcwise connected.

2. The set in Fig. 4.4 (a union of two tangent closed disks) is arcwise connected. 



Figure 4.4 


3. The set in Fig. 4.5 consists of those points on the curve described by $y = \sin(1/x)$,
$0 < x \le 1$, along with the points on the horizontal segment $-1 \le x \le 0$. This set
is connected but not arcwise connected (Exercise 4.46).
The next theorem relates arcwise connectedness with connectedness.

Theorem 4.42. Every arcwise connected set $S$ in $\mathbf{R}^n$ is connected.

Proof. Let $g$ be two-valued on $S$. We will prove that $g$ is constant on $S$. Choose
a point $\mathbf{a}$ in $S$. If $\mathbf{x} \in S$, join $\mathbf{a}$ to $\mathbf{x}$ by an arc $\Gamma$ lying in $S$. Since $\Gamma$ is connected,
$g$ is constant on $\Gamma$ so $g(\mathbf{x}) = g(\mathbf{a})$. But since $\mathbf{x}$ is an arbitrary point of $S$, this shows
that $g$ is constant on $S$, so $S$ is connected.

We have already noted that there are connected sets which are not arcwise 
connected. However, the concepts are equivalent for open sets. 

Theorem 4.43. Every open connected set in $\mathbf{R}^n$ is arcwise connected.

Proof. Let $S$ be an open connected set in $\mathbf{R}^n$ and assume $\mathbf{x} \in S$. We will show that
$\mathbf{x}$ can be joined to every point $\mathbf{y}$ in $S$ by an arc lying in $S$. Let $A$ denote that subset
of $S$ which can be so joined to $\mathbf{x}$, and let $B = S - A$. Then $S = A \cup B$, where
$A$ and $B$ are disjoint. We will show that $A$ and $B$ are both open in $\mathbf{R}^n$.

Assume that $\mathbf{a} \in A$ and join $\mathbf{a}$ to $\mathbf{x}$ by an arc, say $\Gamma$, lying in $S$. Since $\mathbf{a} \in S$
and $S$ is open, there is an $n$-ball $B(\mathbf{a}) \subseteq S$. Every $\mathbf{y}$ in $B(\mathbf{a})$ can be joined to $\mathbf{a}$ by
a line segment (in $S$) and thence to $\mathbf{x}$ by $\Gamma$. Thus $\mathbf{y} \in A$ if $\mathbf{y} \in B(\mathbf{a})$. That is,
$B(\mathbf{a}) \subseteq A$, and hence $A$ is open.

To see that $B$ is also open, assume that $\mathbf{b} \in B$. Then there is an $n$-ball $B(\mathbf{b}) \subseteq S$,
since $S$ is open. But if a point $\mathbf{y}$ in $B(\mathbf{b})$ could be joined to $\mathbf{x}$ by an arc, say $\Gamma$,
lying in $S$, the point $\mathbf{b}$ itself could also be so joined by first joining $\mathbf{b}$ to $\mathbf{y}$ (by a
line segment in $B(\mathbf{b})$) and then using $\Gamma$. But since $\mathbf{b} \notin A$, no point of $B(\mathbf{b})$ can be
in $A$. That is, $B(\mathbf{b}) \subseteq B$, so $B$ is open.

Therefore we have a decomposition $S = A \cup B$, where $A$ and $B$ are disjoint
open sets in $\mathbf{R}^n$. Moreover, $A$ is not empty since $\mathbf{x} \in A$. Since $S$ is connected, it
follows that $B$ must be empty, so $S = A$. Now $A$ is clearly arcwise connected,
because any two of its points can be suitably joined by first joining each of them to
$\mathbf{x}$. Therefore, $S$ is arcwise connected and the proof is complete.

NOTE. A path $\mathbf{f} : [0, 1] \to S$ is said to be polygonal if the image of $[0, 1]$ under $\mathbf{f}$
is the union of a finite number of line segments. The same argument used to prove
Theorem 4.43 also shows that every open connected set in $\mathbf{R}^n$ is polygonally
connected. That is, every pair of points in the set can be joined by a polygonal arc
lying in the set.

Theorem 4.44. Every open set $S$ in $\mathbf{R}^n$ can be expressed in one and only one way as a
countable disjoint union of open connected sets.

Proof. By Theorem 4.40, the components of $S$ form a collection of disjoint sets
whose union is $S$. Each component $T$ of $S$ is open, because if $\mathbf{x} \in T$ then there is
an $n$-ball $B(\mathbf{x})$ contained in $S$. Since $B(\mathbf{x})$ is connected, $B(\mathbf{x}) \subseteq T$, so $T$ is open.
By the Lindelöf theorem (Theorem 3.28), the components of $S$ form a countable
collection, and by Theorem 4.40 the decomposition into components is unique.

Definition 4.45. A set in $\mathbf{R}^n$ is called a region if it is the union of an open connected
set with some, none, or all its boundary points. If none of the boundary points are
included, the region is called an open region. If all the boundary points are included,
the region is called a closed region.

NOTE. Some authors use the term domain instead of open region, especially in the
complex plane.


4.19 UNIFORM CONTINUITY 

Suppose $f$ is defined on a metric space $(S, d_S)$, with values in another metric space
$(T, d_T)$, and assume that $f$ is continuous on a subset $A$ of $S$. Then, given any point
$p$ in $A$ and any $\varepsilon > 0$, there is a $\delta > 0$ (depending on $p$ and on $\varepsilon$) such that, if
$x \in A$, then

$$d_T(f(x), f(p)) < \varepsilon \quad \text{whenever } d_S(x, p) < \delta.$$

In general we cannot expect that for a fixed $\varepsilon$ the same value of $\delta$ will serve equally
well for every point $p$ in $A$. This might happen, however. When it does, the
function is called uniformly continuous on $A$.

Definition 4.46. Let $f : S \to T$ be a function from one metric space $(S, d_S)$ to another
$(T, d_T)$. Then $f$ is said to be uniformly continuous on a subset $A$ of $S$ if the following
condition holds:

For every $\varepsilon > 0$ there exists a $\delta > 0$ (depending only on $\varepsilon$) such that if $x \in A$
and $p \in A$ then

\[ d_T(f(x), f(p)) < \varepsilon \quad \text{whenever } d_S(x, p) < \delta. \tag{6} \]

To emphasize the difference between continuity on A and uniform continuity 
on A we consider the following examples of real-valued functions. 

Examples 

1. Let $f(x) = 1/x$ for $x > 0$ and take $A = (0, 1]$. This function is continuous on $A$
but not uniformly continuous on $A$. To prove this, let $\varepsilon = 10$, and suppose we could
find a $\delta$, $0 < \delta < 1$, to satisfy the condition of the definition. Taking $x = \delta$, $p = \delta/11$,
we obtain $|x - p| < \delta$ and

\[ |f(x) - f(p)| = \frac{11}{\delta} - \frac{1}{\delta} = \frac{10}{\delta} > 10. \]

Hence, for these two points we would always have $|f(x) - f(p)| > 10$, contradicting
the definition of uniform continuity.

2. Let $f(x) = x^2$ if $x \in \mathbf{R}^1$ and take $A = (0, 1]$ as above. This function is uniformly
continuous on $A$. To prove this, observe that, since $0 < x + p \le 2$ for $x$ and $p$ in $A$,

\[ |f(x) - f(p)| = |x^2 - p^2| = |(x - p)(x + p)| \le 2|x - p|. \]

If $|x - p| < \delta$, then $|f(x) - f(p)| < 2\delta$. Hence, if $\varepsilon$ is given, we need only take
$\delta = \varepsilon/2$ to guarantee that $|f(x) - f(p)| < \varepsilon$ for every pair $x$, $p$ with $|x - p| < \delta$.
This shows that $f$ is uniformly continuous on $A$.

An instructive exercise is to show that the function in Example 2 is not uniformly
continuous on $\mathbf{R}^1$.
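
The two examples can also be explored numerically. The following Python sketch is only an illustration and is not part of the argument; the function names `reciprocal_gap` and `square_bound_holds` are hypothetical. It evaluates $|f(x) - f(p)|$ for $f(x) = 1/x$ at the points $x = \delta$, $p = \delta/11$ used in Example 1, and spot-checks the inequality $|x^2 - p^2| \le 2|x - p|$ from Example 2 on a finite grid in $(0, 1]$.

```python
def reciprocal_gap(delta):
    """|f(x) - f(p)| for f(t) = 1/t at x = delta, p = delta/11 (Example 1)."""
    x, p = delta, delta / 11
    return abs(1 / x - 1 / p)


def square_bound_holds(x, p, slack=1e-12):
    """Check |x^2 - p^2| <= 2|x - p| for x, p in (0, 1] (Example 2).

    The small slack absorbs floating-point rounding.
    """
    return abs(x * x - p * p) <= 2 * abs(x - p) + slack


# For f(t) = 1/t the gap equals 10/delta, so it exceeds 10 for every delta < 1.
gaps = [reciprocal_gap(10.0 ** (-k)) for k in range(1, 6)]

# For f(t) = t^2 the bound is checked at every pair of grid points in (0, 1].
grid = [k / 50 for k in range(1, 51)]
bound_ok = all(square_bound_holds(x, p) for x in grid for p in grid)
```

A finite grid check proves nothing by itself; it merely illustrates that no single $\delta$ controls $1/x$ near 0, while the bound $2|x - p|$ holds at every sampled pair for $x^2$.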


4.20 UNIFORM CONTINUITY AND COMPACT SETS 

Uniform continuity on a set A implies continuity on A. (The reader should verify 
this.) The converse is also true if A is compact. 

Theorem 4.47 (Heine). Let $f : S \to T$ be a function from one metric space $(S, d_S)$
to another $(T, d_T)$. Let $A$ be a compact subset of $S$ and assume that $f$ is continuous
on $A$. Then $f$ is uniformly continuous on $A$.

Proof. Let $\varepsilon > 0$ be given. Then each point $a$ in $A$ has associated with it a ball
$B_S(a; r)$, with $r$ depending on $a$, such that

\[ d_T(f(x), f(a)) < \frac{\varepsilon}{2} \quad \text{whenever } x \in B_S(a; r) \cap A. \]

Consider the collection of balls $B_S(a; r/2)$ each with radius $r/2$. These cover $A$
and, since $A$ is compact, a finite number of them also cover $A$, say

\[ A \subseteq \bigcup_{k=1}^{m} B_S\!\left(a_k; \frac{r_k}{2}\right). \]

In any ball of twice the radius, $B_S(a_k; r_k)$, we have

\[ d_T(f(x), f(a_k)) < \frac{\varepsilon}{2} \quad \text{whenever } x \in B_S(a_k; r_k) \cap A. \]

Let $\delta$ be the smallest of the numbers $r_1/2, \ldots, r_m/2$. We shall show that this $\delta$
works in the definition of uniform continuity.

For this purpose, consider two points of $A$, say $x$ and $p$ with $d_S(x, p) < \delta$.
By the above discussion there is some ball $B_S(a_k; r_k/2)$ containing $x$, so

\[ d_T(f(x), f(a_k)) < \frac{\varepsilon}{2}. \]

By the triangle inequality we have

\[ d_S(p, a_k) \le d_S(p, x) + d_S(x, a_k) < \delta + \frac{r_k}{2} \le \frac{r_k}{2} + \frac{r_k}{2} = r_k. \]

Hence, $p \in B_S(a_k; r_k) \cap A$, so we also have $d_T(f(p), f(a_k)) < \varepsilon/2$. Using the
triangle inequality once more we find

\[ d_T(f(x), f(p)) \le d_T(f(x), f(a_k)) + d_T(f(a_k), f(p)) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

This completes the proof.

4.21 FIXED-POINT THEOREM FOR CONTRACTIONS

Let $f : S \to S$ be a function from a metric space $(S, d)$ into itself. A point $p$ in
$S$ is called a fixed point of $f$ if $f(p) = p$. The function $f$ is called a contraction of
$S$ if there is a positive number $\alpha < 1$ (called a contraction constant), such that

\[ d(f(x), f(y)) \le \alpha\, d(x, y) \quad \text{for all } x, y \text{ in } S. \tag{7} \]

Clearly, a contraction of any metric space is uniformly continuous on $S$.

Theorem 4.48 (Fixed-point theorem). A contraction $f$ of a complete metric space $S$
has a unique fixed point $p$.

Proof. If $p$ and $p'$ are two fixed points, (7) implies $d(p, p') \le \alpha\, d(p, p')$,
so $d(p, p') = 0$ and $p = p'$. Hence $f$ has at most one fixed point.

To prove it has one, take any point $x$ in $S$ and consider the sequence of iterates:

\[ x, \quad f(x), \quad f(f(x)), \quad \ldots \]

That is, define a sequence $\{p_n\}$ inductively as follows:

\[ p_0 = x, \qquad p_{n+1} = f(p_n), \qquad n = 0, 1, 2, \ldots \]

We will prove that $\{p_n\}$ converges to a fixed point of $f$. First we show that $\{p_n\}$ is
a Cauchy sequence. From (7) we have

\[ d(p_{n+1}, p_n) = d(f(p_n), f(p_{n-1})) \le \alpha\, d(p_n, p_{n-1}), \]

so, by induction, we find

\[ d(p_{n+1}, p_n) \le \alpha^n d(p_1, p_0) = c\alpha^n, \]

where $c = d(p_1, p_0)$. Using the triangle inequality we find, for $m > n$,

\[ d(p_m, p_n) \le \sum_{k=n}^{m-1} d(p_{k+1}, p_k) \le c \sum_{k=n}^{m-1} \alpha^k = c\, \frac{\alpha^n - \alpha^m}{1 - \alpha} < \frac{c}{1 - \alpha}\, \alpha^n. \]

Since $\alpha^n \to 0$ as $n \to \infty$, this inequality shows that $\{p_n\}$ is a Cauchy sequence. But
$S$ is complete so there is a point $p$ in $S$ such that $p_n \to p$. By continuity of $f$,

\[ f(p) = f\!\left(\lim_{n \to \infty} p_n\right) = \lim_{n \to \infty} f(p_n) = \lim_{n \to \infty} p_{n+1} = p, \]

so $p$ is a fixed point of $f$. This completes the proof.

Many important existence theorems in analysis are easy consequences of the 
fixed point theorem. Examples are given in Exercises 7.36 and 7.37. Reference 
4.4 gives applications to numerical analysis. 
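
The iteration used in the proof is easy to carry out numerically. The following Python sketch is an illustration under stated assumptions, not a method prescribed by the text: it uses the map $f(x) = \frac{1}{2}(x + 2/x)$ on $S = [1, +\infty)$ from Exercise 4.68(b), which that exercise asserts is a contraction with constant $\alpha = \frac{1}{2}$ and fixed point $\sqrt{2}$. The helper name `iterate` is hypothetical.

```python
import math


def iterate(f, x, n):
    """Return [p_0, ..., p_n] with p_0 = x and p_{k+1} = f(p_k)."""
    points = [x]
    for _ in range(n):
        points.append(f(points[-1]))
    return points


def f(x):
    """The map of Exercise 4.68(b) on [1, +infinity)."""
    return 0.5 * (x + 2.0 / x)


points = iterate(f, 1.0, 6)

# Distance from each iterate to the fixed point sqrt(2).
errors = [abs(p - math.sqrt(2)) for p in points]

# Bound from Exercise 4.68: |p_n - sqrt(2)| <= 2**(-n) when p_0 = 1.
bounds = [2.0 ** (-n) for n in range(len(points))]
```

Comparing `errors` with `bounds` term by term shows the iterates staying within the estimate $2^{-n}$; in this particular example the observed convergence is in fact much faster than the bound requires.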

4.22 DISCONTINUITIES OF REAL-VALUED FUNCTIONS 

The rest of this chapter is devoted to special properties of real-valued functions 
defined on subintervals of R. 

Let $f$ be defined on an interval $(a, b)$. Assume $c \in [a, b)$. If $f(x) \to A$ as
$x \to c$ through values greater than $c$, we say that $A$ is the righthand limit of $f$ at $c$
and we indicate this by writing

\[ \lim_{x \to c+} f(x) = A. \]

The righthand limit $A$ is also denoted by $f(c+)$. In the $\varepsilon$, $\delta$ terminology this means
that for every $\varepsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - f(c+)| < \varepsilon \quad \text{whenever } c < x < c + \delta < b. \]

Note that $f$ need not be defined at the point $c$ itself. If $f$ is defined at $c$ and if
$f(c+) = f(c)$, we say that $f$ is continuous from the right at $c$.

Lefthand limits and continuity from the left at $c$ are similarly defined if
$c \in (a, b]$.

If $a < c < b$, then $f$ is continuous at $c$ if, and only if,

\[ f(c) = f(c+) = f(c-). \]

We say $c$ is a discontinuity of $f$ if $f$ is not continuous at $c$. In this case one of
the following conditions is satisfied:

a) Either $f(c+)$ or $f(c-)$ does not exist.

b) Both $f(c+)$ and $f(c-)$ exist but have different values.

c) Both $f(c+)$ and $f(c-)$ exist and $f(c+) = f(c-) \ne f(c)$.

In case (c), the point $c$ is called a removable discontinuity, since the discontinuity
could be removed by redefining $f$ at $c$ to have the value $f(c+) = f(c-)$. In cases
(a) and (b), we call $c$ an irremovable discontinuity because the discontinuity cannot
be removed by redefining $f$ at $c$.

Definition 4.49. Let $f$ be defined on a closed interval $[a, b]$. If $f(c+)$ and $f(c-)$
both exist at some interior point $c$, then:

a) $f(c) - f(c-)$ is called the lefthand jump of $f$ at $c$,

b) $f(c+) - f(c)$ is called the righthand jump of $f$ at $c$,

c) $f(c+) - f(c-)$ is called the jump of $f$ at $c$.

If any one of these three numbers is different from 0, then $c$ is called a jump
discontinuity of $f$.

For the endpoints $a$ and $b$, only one-sided jumps are considered, the righthand
jump at $a$, $f(a+) - f(a)$, and the lefthand jump at $b$, $f(b) - f(b-)$.

Examples 

1. The function $f$ defined by $f(x) = x/|x|$ if $x \ne 0$, $f(0) = A$, has a jump discontinuity
at 0, regardless of the value of $A$. Here $f(0+) = 1$ and $f(0-) = -1$. (See Fig. 4.6.)

2. The function $f$ defined by $f(x) = 1$ if $x \ne 0$, $f(0) = 0$, has a removable jump
discontinuity at 0. In this case $f(0+) = f(0-) = 1$.

3. The function $f$ defined by $f(x) = 1/x$ if $x \ne 0$, $f(0) = A$, has an irremovable
discontinuity at 0. In this case neither $f(0+)$ nor $f(0-)$ exists. (See Fig. 4.7.)

4. The function $f$ defined by $f(x) = \sin(1/x)$ if $x \ne 0$, $f(0) = A$, has an irremovable
discontinuity at 0 since neither $f(0+)$ nor $f(0-)$ exists. (See Fig. 4.8.)

5. The function $f$ defined by $f(x) = x \sin(1/x)$ if $x \ne 0$, $f(0) = 1$, has a removable
jump discontinuity at 0, since $f(0+) = f(0-) = 0$. (See Fig. 4.9.)
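
One-sided limits such as those in Examples 1 and 5 can be made plausible by sampling the function at points approaching 0 from each side. The Python sketch below is only a numerical illustration (sampling can suggest a limit but never proves that one exists); the names `one_sided_samples`, `sign`, and `damped` are hypothetical.

```python
import math


def one_sided_samples(f, c, side, steps=8):
    """Values of f at c + side * 10**(-k) for k = 1, ..., steps.

    side = +1 approaches c from the right, side = -1 from the left.
    """
    return [f(c + side * 10.0 ** (-k)) for k in range(1, steps + 1)]


def sign(x):
    """Example 1: x/|x| for x != 0."""
    return x / abs(x)


def damped(x):
    """Example 5: x * sin(1/x) for x != 0."""
    return x * math.sin(1 / x)


right_sign = one_sided_samples(sign, 0.0, +1)    # samples suggesting f(0+) = 1
left_sign = one_sided_samples(sign, 0.0, -1)     # samples suggesting f(0-) = -1
right_damped = one_sided_samples(damped, 0.0, +1)
left_damped = one_sided_samples(damped, 0.0, -1)
```

For Example 1 the samples are constant on each side, so the jump $f(0+) - f(0-)$ is 2. For Example 5 each sample satisfies $|x \sin(1/x)| \le |x|$, consistent with $f(0+) = f(0-) = 0$.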



Figure 4.8 



4.23 MONOTONIC FUNCTIONS 

Definition 4.50. Let $f$ be a real-valued function defined on a subset $S$ of $\mathbf{R}$. Then
$f$ is said to be increasing (or nondecreasing) on $S$ if for every pair of points $x$ and $y$
in $S$,

\[ x < y \quad \text{implies} \quad f(x) \le f(y). \]

If $x < y$ implies $f(x) < f(y)$, then $f$ is said to be strictly increasing on $S$. (Decreasing
functions are similarly defined.) A function is called monotonic on $S$ if it is increasing
on $S$ or decreasing on $S$.

If $f$ is an increasing function, then $-f$ is a decreasing function. Because of this
simple fact, in many situations involving monotonic functions it suffices to consider
only the case of increasing functions.

We shall prove that functions which are monotonic on compact intervals
always have finite right- and lefthand limits. Hence their discontinuities (if any)
must be jump discontinuities.

Theorem 4.51. If $f$ is increasing on $[a, b]$, then $f(c+)$ and $f(c-)$ both exist for each
$c$ in $(a, b)$ and we have

\[ f(c-) \le f(c) \le f(c+). \]

At the endpoints we have

\[ f(a) \le f(a+) \quad \text{and} \quad f(b-) \le f(b). \]

Proof. Let $A = \{f(x) : a < x < c\}$. Since $f$ is increasing, this set is bounded
above by $f(c)$. Let $\alpha = \sup A$. Then $\alpha \le f(c)$ and we shall prove that $f(c-)$
exists and equals $\alpha$.

To do this we must show that for every $\varepsilon > 0$ there is a $\delta > 0$ such that

\[ c - \delta < x < c \quad \text{implies} \quad |f(x) - \alpha| < \varepsilon. \]

But since $\alpha = \sup A$, there is an element $f(x_1)$ of $A$ such that $\alpha - \varepsilon < f(x_1) \le \alpha$.
Since $f$ is increasing, for every $x$ in $(x_1, c)$ we also have $\alpha - \varepsilon < f(x) \le \alpha$, and
hence $|f(x) - \alpha| < \varepsilon$. Therefore the number $\delta = c - x_1$ has the required
property. (The proof that $f(c+)$ exists and is $\ge f(c)$ is similar, and only trivial
modifications are needed for the endpoints.)

There is, of course, a corresponding theorem for decreasing functions which 
the reader can formulate for himself. 

Theorem 4.52. Let $f$ be strictly increasing on a set $S$ in $\mathbf{R}$. Then $f^{-1}$ exists and is
strictly increasing on $f(S)$.

Proof. Since $f$ is strictly increasing it is one-to-one on $S$, so $f^{-1}$ exists. To see
that $f^{-1}$ is strictly increasing, let $y_1 < y_2$ be two points in $f(S)$ and let $x_1 = f^{-1}(y_1)$,
$x_2 = f^{-1}(y_2)$. We cannot have $x_1 \ge x_2$, for then we would also have $y_1 \ge y_2$.
The only alternative is

\[ x_1 < x_2, \]

and this means that $f^{-1}$ is strictly increasing.

Theorem 4.52, together with Theorem 4.29, now gives us:

Theorem 4.53. Let $f$ be strictly increasing and continuous on a compact interval
$[a, b]$. Then $f^{-1}$ is continuous and strictly increasing on the interval $[f(a), f(b)]$.

NOTE. Theorem 4.53 tells us that a continuous, strictly increasing function is a
topological mapping. Conversely, every topological mapping of an interval $[a, b]$
onto an interval $[c, d]$ must be a strictly monotonic function. The verification of
this fact will be an instructive exercise for the reader (Exercise 4.62).

Limits of sequences

4.1 Prove each of the following statements about sequences in $\mathbf{C}$.

a) $z^n \to 0$ if $|z| < 1$; $\{z^n\}$ diverges if $|z| > 1$.

b) If $z_n \to 0$ and if $\{c_n\}$ is bounded, then $\{c_n z_n\} \to 0$.

c) $z^n/n! \to 0$ for every complex $z$.

d) If $a_n = \sqrt{n^2 + 2} - n$, then $a_n \to 0$.

4.2 If $a_{n+2} = (a_{n+1} + a_n)/2$ for all $n \ge 1$, show that $a_n \to (a_1 + 2a_2)/3$. Hint.
$a_{n+2} - a_{n+1} = \frac{1}{2}(a_n - a_{n+1})$.

4.3 If $0 < x_1 < 1$ and if $x_{n+1} = 1 - \sqrt{1 - x_n}$ for all $n \ge 1$, prove that $\{x_n\}$ is a
decreasing sequence with limit 0. Prove also that $x_{n+1}/x_n \to \frac{1}{2}$.

4.4 Two sequences of positive integers $\{a_n\}$ and $\{b_n\}$ are defined recursively by taking
$a_1 = b_1 = 1$ and equating rational and irrational parts in the equation

\[ a_n + b_n\sqrt{2} = (a_{n-1} + b_{n-1}\sqrt{2})^2 \quad \text{for } n \ge 2. \]

Prove that $a_n^2 - 2b_n^2 = 1$ for $n \ge 2$. Deduce that $a_n/b_n \to \sqrt{2}$ through values $> \sqrt{2}$,
and that $2b_n/a_n \to \sqrt{2}$ through values $< \sqrt{2}$.

4.5 A real sequence $\{x_n\}$ satisfies $7x_{n+1} = x_n^3 + 6$ for $n \ge 1$. If $x_1 = \frac{1}{2}$, prove that the
sequence increases and find its limit. What happens if $x_1 = \frac{3}{2}$ or if $x_1 = \frac{5}{2}$?

4.6 If $|a_n| < 2$ and $|a_{n+2} - a_{n+1}| \le \frac{1}{8}|a_{n+1}^2 - a_n^2|$ for all $n \ge 1$, prove that $\{a_n\}$
converges.

4.7 In a metric space $(S, d)$, assume that $x_n \to x$ and $y_n \to y$. Prove that $d(x_n, y_n) \to d(x, y)$.

4.8 Prove that in a compact metric space $(S, d)$, every sequence in $S$ has a subsequence
which converges in $S$. This property also implies that $S$ is compact but you are not
required to prove this. (For a proof see either Reference 4.2 or 4.3.)

4.9 Let $A$ be a subset of a metric space $S$. If $A$ is complete, prove that $A$ is closed. Prove
that the converse also holds if $S$ is complete.

Limits of functions 

NOTE. In Exercises 4.10 through 4.28, all functions are real-valued.

4.10 Let $f$ be defined on an open interval $(a, b)$ and assume $x \in (a, b)$. Consider the two
statements

a) $\lim_{h \to 0} |f(x + h) - f(x)| = 0$; b) $\lim_{h \to 0} |f(x + h) - f(x - h)| = 0$.

Prove that (a) always implies (b), and give an example in which (b) holds but (a) does not.

4.11 Let $f$ be defined on $\mathbf{R}^2$. If

\[ \lim_{(x,y) \to (a,b)} f(x, y) = L \]

and if the one-dimensional limits $\lim_{x \to a} f(x, y)$ and $\lim_{y \to b} f(x, y)$ both exist, prove that

\[ \lim_{x \to a} \left[ \lim_{y \to b} f(x, y) \right] = \lim_{y \to b} \left[ \lim_{x \to a} f(x, y) \right] = L. \]

Now consider the functions $f$ defined on $\mathbf{R}^2$ as follows:

a) $f(x, y) = \dfrac{x^2 - y^2}{x^2 + y^2}$ if $(x, y) \ne (0, 0)$, $f(0, 0) = 0$.

b) $f(x, y) = \dfrac{(xy)^2}{(xy)^2 + (x - y)^2}$ if $(x, y) \ne (0, 0)$, $f(0, 0) = 0$.

c) $f(x, y) = \dfrac{1}{x} \sin(xy)$ if $x \ne 0$, $f(0, y) = y$.

d) $f(x, y) = (x + y) \sin(1/x) \sin(1/y)$ if $x \ne 0$ and $y \ne 0$; $f(x, y) = 0$ if $x = 0$ or $y = 0$.

e) $f(x, y) = \dfrac{\sin x - \sin y}{\tan x - \tan y}$ if $\tan x \ne \tan y$; $f(x, y) = \cos^3 x$ if $\tan x = \tan y$.

In each of the preceding examples, determine whether the following limits exist and
evaluate those limits that do exist:

\[ \lim_{x \to 0} \left[ \lim_{y \to 0} f(x, y) \right]; \qquad \lim_{y \to 0} \left[ \lim_{x \to 0} f(x, y) \right]; \qquad \lim_{(x,y) \to (0,0)} f(x, y). \]

4.12 If $x \in [0, 1]$ prove that the following limit exists,

\[ \lim_{m \to \infty} \left[ \lim_{n \to \infty} \cos^{2n}(m!\,\pi x) \right], \]

and that its value is 0 or 1, according to whether $x$ is irrational or rational.


Continuity of real-valued functions 

4.13 Let $f$ be continuous on $[a, b]$ and let $f(x) = 0$ when $x$ is rational. Prove that
$f(x) = 0$ for every $x$ in $[a, b]$.

4.14 Let $f$ be continuous at the point $a = (a_1, a_2, \ldots, a_n)$ in $\mathbf{R}^n$. Keep $a_2, a_3, \ldots, a_n$
fixed and define a new function $g$ of one real variable by the equation

\[ g(x) = f(x, a_2, \ldots, a_n). \]

Prove that $g$ is continuous at the point $x = a_1$. (This is sometimes stated as follows:
A continuous function of $n$ variables is continuous in each variable separately.)

4.15 Show by an example that the converse of the statement in Exercise 4.14 is not true 
in general. 

4.16 Let $f$, $g$, and $h$ be defined on $[0, 1]$ as follows:

$f(x) = g(x) = h(x) = 0$, whenever $x$ is irrational;

$f(x) = 1$ and $g(x) = x$, whenever $x$ is rational;

$h(x) = 1/n$, if $x$ is the rational number $m/n$ (in lowest terms);
$h(0) = 1$.

Prove that $f$ is not continuous anywhere in $[0, 1]$, that $g$ is continuous only at $x = 0$, and
that $h$ is continuous only at the irrational points in $[0, 1]$.

4.17 For each $x$ in $[0, 1]$, let $f(x) = x$ if $x$ is rational, and let $f(x) = 1 - x$ if $x$ is
irrational. Prove that:

a) $f(f(x)) = x$ for all $x$ in $[0, 1]$.

b) $f(x) + f(1 - x) = 1$ for all $x$ in $[0, 1]$.

c) $f$ is continuous only at the point $x = \frac{1}{2}$.

d) $f$ assumes every value between 0 and 1.

e) $f(x + y) - f(x) - f(y)$ is rational for all $x$ and $y$ in $[0, 1]$.

4.18 Let $f$ be defined on $\mathbf{R}$ and assume that there exists at least one point $x_0$ in $\mathbf{R}$ at which
$f$ is continuous. Suppose also that, for every $x$ and $y$ in $\mathbf{R}$, $f$ satisfies the equation

\[ f(x + y) = f(x) + f(y). \]

Prove that there exists a constant $a$ such that $f(x) = ax$ for all $x$.

4.19 Let $f$ be continuous on $[a, b]$ and define $g$ as follows: $g(a) = f(a)$ and, for $a < x \le b$,
let $g(x)$ be the maximum value of $f$ in the subinterval $[a, x]$. Show that $g$ is continuous on
$[a, b]$.

4.20 Let $f_1, \ldots, f_m$ be $m$ real-valued functions defined on a set $S$ in $\mathbf{R}^n$. Assume that each
$f_k$ is continuous at the point $a$ of $S$. Define a new function $f$ as follows: For each $x$ in $S$,
$f(x)$ is the largest of the $m$ numbers $f_1(x), \ldots, f_m(x)$. Discuss the continuity of $f$ at $a$.

4.21 Let $f : S \to \mathbf{R}$ be continuous on an open set $S$ in $\mathbf{R}^n$, assume that $p \in S$, and assume
that $f(p) > 0$. Prove that there is an $n$-ball $B(p; r)$ such that $f(x) > 0$ for every $x$ in the
ball.

4.22 Let $f$ be defined and continuous on a closed set $S$ in $\mathbf{R}$. Let

\[ A = \{x : x \in S \text{ and } f(x) = 0\}. \]

Prove that $A$ is a closed subset of $\mathbf{R}$.

4.23 Given a function $f : \mathbf{R} \to \mathbf{R}$, define two sets $A$ and $B$ in $\mathbf{R}^2$ as follows:

\[ A = \{(x, y) : y < f(x)\}, \qquad B = \{(x, y) : y > f(x)\}. \]

Prove that $f$ is continuous on $\mathbf{R}$ if, and only if, both $A$ and $B$ are open subsets of $\mathbf{R}^2$.

4.24 Let $f$ be defined and bounded on a compact interval $S$ in $\mathbf{R}$. If $T \subseteq S$, the number

\[ \Omega_f(T) = \sup \{f(x) - f(y) : x \in T,\ y \in T\} \]

is called the oscillation (or span) of $f$ on $T$. If $x \in S$, the oscillation of $f$ at $x$ is defined to
be the number

\[ \omega_f(x) = \lim_{h \to 0+} \Omega_f(B(x; h) \cap S). \]

Prove that this limit always exists and that $\omega_f(x) = 0$ if, and only if, $f$ is continuous at $x$.

4.25 Let $f$ be continuous on a compact interval $[a, b]$. Suppose that $f$ has a local maximum
at $x_1$ and a local maximum at $x_2$. Show that there must be a third point between
$x_1$ and $x_2$ where $f$ has a local minimum.

NOTE. To say that $f$ has a local maximum at $x_1$ means that there is a 1-ball $B(x_1)$ such
that $f(x) \le f(x_1)$ for all $x$ in $B(x_1) \cap [a, b]$. Local minimum is similarly defined.

4.26 Let $f$ be a real-valued function, continuous on $[0, 1]$, with the following property:
For every real $y$, either there is no $x$ in $[0, 1]$ for which $f(x) = y$ or there is exactly one
such $x$. Prove that $f$ is strictly monotonic on $[0, 1]$.

4.27 Let $f$ be a function defined on $[0, 1]$ with the following property: For every real
number $y$, either there is no $x$ in $[0, 1]$ for which $f(x) = y$ or there are exactly two values
of $x$ in $[0, 1]$ for which $f(x) = y$.

a) Prove that $f$ cannot be continuous on $[0, 1]$.

b) Construct a function $f$ which has the above property.

c) Prove that any function with this property has infinitely many discontinuities on
$[0, 1]$.

4.28 In each case, give an example of a function $f$, continuous on $S$ and such that
$f(S) = T$, or else explain why there can be no such $f$:

a) $S = (0, 1)$, $T = (0, 1]$.

b) $S = (0, 1)$, $T = (0, 1) \cup (1, 2)$.

c) $S = \mathbf{R}^1$, $T$ = the set of rational numbers.

d) $S = [0, 1] \cup [2, 3]$, $T = \{0, 1\}$.

e) $S = [0, 1] \times [0, 1]$, $T = \mathbf{R}^2$.

f) $S = [0, 1] \times [0, 1]$, $T = (0, 1) \times (0, 1)$.

g) $S = (0, 1) \times (0, 1)$, $T = \mathbf{R}^2$.

Continuity in metric spaces 

In Exercises 4.29 through 4.33, we assume that $f : S \to T$ is a function from one metric
space $(S, d_S)$ to another $(T, d_T)$.

4.29 Prove that $f$ is continuous on $S$ if, and only if,

\[ f^{-1}(\operatorname{int} B) \subseteq \operatorname{int} f^{-1}(B) \quad \text{for every subset } B \text{ of } T. \]

4.30 Prove that $f$ is continuous on $S$ if, and only if,

\[ f(\bar{A}) \subseteq \overline{f(A)} \quad \text{for every subset } A \text{ of } S. \]

4.31 Prove that $f$ is continuous on $S$ if, and only if, $f$ is continuous on every compact
subset of $S$. Hint. If $x_n \to p$ in $S$, the set $\{p, x_1, x_2, \ldots\}$ is compact.

4.32 A function $f : S \to T$ is called a closed mapping on $S$ if the image $f(A)$ is closed in $T$
for every closed subset $A$ of $S$. Prove that $f$ is continuous and closed on $S$ if, and only
if, $f(\bar{A}) = \overline{f(A)}$ for every subset $A$ of $S$.

4.33 Give an example of a continuous $f$ and a Cauchy sequence $\{x_n\}$ in some metric
space $S$ for which $\{f(x_n)\}$ is not a Cauchy sequence in $T$.

4.34 Prove that the interval $(-1, 1)$ in $\mathbf{R}^1$ is homeomorphic to $\mathbf{R}^1$. This shows that
neither boundedness nor completeness is a topological property.

4.35 Section 9.7 contains an example of a function $f$, continuous on $[0, 1]$, with
$f([0, 1]) = [0, 1] \times [0, 1]$. Prove that no such $f$ can be one-to-one on $[0, 1]$.


Connectedness 

4.36 Prove that a metric space $S$ is disconnected if, and only if, there is a nonempty subset
$A$ of $S$, $A \ne S$, which is both open and closed in $S$.

4.37 Prove that a metric space S is connected if, and only if, the only subsets of S which 
are both open and closed in S are the empty set and S itself. 

4.38 Prove that the only connected subsets of R are (a) the empty set, (b) sets consisting 
of a single point, and (c) intervals (open, closed, half-open, or infinite). 

4.39 Let $X$ be a connected subset of a metric space $S$. Let $Y$ be a subset of $S$ such that
$X \subseteq Y \subseteq \bar{X}$, where $\bar{X}$ is the closure of $X$. Prove that $Y$ is also connected. In particular,
this shows that $\bar{X}$ is connected.

4.40 If $x$ is a point in a metric space $S$, let $U(x)$ be the component of $S$ containing $x$.
Prove that $U(x)$ is closed in $S$.

4.41 Let $S$ be an open subset of $\mathbf{R}$. By Theorem 3.11, $S$ is the union of a countable
disjoint collection of open intervals in $\mathbf{R}$. Prove that each of these open intervals is a
component of the metric subspace $S$. Explain why this does not contradict Exercise 4.40.

4.42 Given a compact set $S$ in $\mathbf{R}^m$ with the following property: For every pair of points
$a$ and $b$ in $S$ and for every $\varepsilon > 0$ there exists a finite set of points $\{x_0, x_1, \ldots, x_n\}$ in $S$
with $x_0 = a$ and $x_n = b$ such that

\[ \|x_k - x_{k-1}\| < \varepsilon \quad \text{for } k = 1, 2, \ldots, n. \]

Prove or disprove: $S$ is connected.

4.43 Prove that a metric space S is connected if, and only if, every nonempty proper 
subset of S has a nonempty boundary. 

4.44 Prove that every convex subset of $\mathbf{R}^n$ is connected.

4.45 Given a function $f : \mathbf{R}^n \to \mathbf{R}^m$ which is one-to-one and continuous on $\mathbf{R}^n$. If $A$ is
open and disconnected in $\mathbf{R}^n$, prove that $f(A)$ is open and disconnected in $f(\mathbf{R}^n)$.

4.46 Let $A = \{(x, y) : 0 < x \le 1,\ y = \sin 1/x\}$, $B = \{(x, y) : y = 0,\ -1 \le x \le 0\}$,
and let $S = A \cup B$. Prove that $S$ is connected but not arcwise connected. (See Fig. 4.5,
Section 4.18.)

4.47 Let $F = \{F_1, F_2, \ldots\}$ be a countable collection of connected compact sets in $\mathbf{R}^n$
such that $F_{k+1} \subseteq F_k$ for each $k \ge 1$. Prove that the intersection $\bigcap_{k=1}^{\infty} F_k$ is connected
and closed.

4.48 Let $S$ be an open connected set in $\mathbf{R}^n$. Let $T$ be a component of $\mathbf{R}^n - S$. Prove that
$\mathbf{R}^n - T$ is connected.

4.49 Let $(S, d)$ be a connected metric space which is not bounded. Prove that for every
$a$ in $S$ and every $r > 0$, the set $\{x : d(x, a) = r\}$ is nonempty.


Uniform continuity 

4.50 Prove that a function which is uniformly continuous on S is also continuous on S . 

4.51 If $f(x) = x^2$ for $x$ in $\mathbf{R}$, prove that $f$ is not uniformly continuous on $\mathbf{R}$.

4.52 Assume that $f$ is uniformly continuous on a bounded set $S$ in $\mathbf{R}^n$. Prove that $f$ must
be bounded on $S$.

4.53 Let $f$ be a function defined on a set $S$ in $\mathbf{R}^n$ and assume that $f(S) \subseteq \mathbf{R}^m$. Let $g$ be
defined on $f(S)$ with value in $\mathbf{R}^k$, and let $h$ denote the composite function defined by
$h(x) = g[f(x)]$ if $x \in S$. If $f$ is uniformly continuous on $S$ and if $g$ is uniformly continuous
on $f(S)$, show that $h$ is uniformly continuous on $S$.

4.54 Assume $f : S \to T$ is uniformly continuous on $S$, where $S$ and $T$ are metric spaces.
If $\{x_n\}$ is any Cauchy sequence in $S$, prove that $\{f(x_n)\}$ is a Cauchy sequence in $T$.
(Compare with Exercise 4.33.)

4.55 Let $f : S \to T$ be a function from a metric space $S$ to another metric space $T$.
Assume $f$ is uniformly continuous on a subset $A$ of $S$ and that $T$ is complete. Prove that
there is a unique extension of $f$ to $\bar{A}$ which is uniformly continuous on $\bar{A}$.

4.56 In a metric space $(S, d)$, let $A$ be a nonempty subset of $S$. Define a function
$f_A : S \to \mathbf{R}$ by the equation

\[ f_A(x) = \inf \{d(x, y) : y \in A\} \]

for each $x$ in $S$. The number $f_A(x)$ is called the distance from $x$ to $A$.

a) Prove that $f_A$ is uniformly continuous on $S$.

b) Prove that $\bar{A} = \{x : x \in S \text{ and } f_A(x) = 0\}$.

4.57 In a metric space $(S, d)$, let $A$ and $B$ be disjoint closed subsets of $S$. Prove that there
exist disjoint open subsets $U$ and $V$ of $S$ such that $A \subseteq U$ and $B \subseteq V$. Hint. Let
$g(x) = f_A(x) - f_B(x)$, in the notation of Exercise 4.56, and consider $g^{-1}(-\infty, 0)$ and
$g^{-1}(0, +\infty)$.


Discontinuities 

4.58 Locate and classify the discontinuities of the functions $f$ defined on $\mathbf{R}^1$ by the
following equations:

a) $f(x) = (\sin x)/x$ if $x \ne 0$, $f(0) = 0$.

b) $f(x) = e^{1/x}$ if $x \ne 0$, $f(0) = 0$.

c) $f(x) = e^{1/x} + \sin(1/x)$ if $x \ne 0$, $f(0) = 0$.

d) $f(x) = 1/(1 - e^{1/x})$ if $x \ne 0$, $f(0) = 0$.

4.59 Locate the points in $\mathbf{R}^2$ at which each of the functions in Exercise 4.11 is not
continuous.


Monotonic functions 

4.60 Let $f$ be defined in the open interval $(a, b)$ and assume that for each interior point $x$
of $(a, b)$ there exists a 1-ball $B(x)$ in which $f$ is increasing. Prove that $f$ is an increasing
function throughout $(a, b)$.

4.61 Let $f$ be continuous on a compact interval $[a, b]$ and assume that $f$ does not have a
local maximum or a local minimum at any interior point. (See the note following
Exercise 4.25.) Prove that $f$ must be monotonic on $[a, b]$.

4.62 If $f$ is one-to-one and continuous on $[a, b]$, prove that $f$ must be strictly monotonic
on $[a, b]$. That is, prove that every topological mapping of $[a, b]$ onto an interval $[c, d]$
must be strictly monotonic.

4.63 Let $f$ be an increasing function defined on $[a, b]$ and let $x_1, \ldots, x_n$ be $n$ points in
the interior such that $a < x_1 < x_2 < \cdots < x_n < b$.

a) Show that $\sum_{k=1}^{n} [f(x_k+) - f(x_k-)] \le f(b-) - f(a+)$.

b) Deduce from part (a) that the set of discontinuities of $f$ is countable.

c) Prove that $f$ has points of continuity in every open subinterval of $[a, b]$.

4.64 Give an example of a function $f$, defined and strictly increasing on a set $S$ in $\mathbf{R}$, such
that $f^{-1}$ is not continuous on $f(S)$.

4.65 Let $f$ be strictly increasing on a subset $S$ of $\mathbf{R}$. Assume that the image $f(S)$ has one
of the following properties: (a) $f(S)$ is open; (b) $f(S)$ is connected; (c) $f(S)$ is closed. Prove
that $f$ must be continuous on $S$.


Metric spaces and fixed points 

4.66 Let $B(S)$ denote the set of all real-valued functions which are defined and bounded
on a nonempty set $S$. If $f \in B(S)$, let

\[ \|f\| = \sup_{x \in S} |f(x)|. \]

The number $\|f\|$ is called the “sup norm” of $f$.

a) Prove that the formula $d(f, g) = \|f - g\|$ defines a metric $d$ on $B(S)$.

b) Prove that the metric space $(B(S), d)$ is complete. Hint. If $\{f_n\}$ is a Cauchy
sequence in $B(S)$, show that $\{f_n(x)\}$ is a Cauchy sequence of real numbers for each $x$ in $S$.

4.67 Refer to Exercise 4.66 and let $C(S)$ denote the subset of $B(S)$ consisting of all
functions continuous and bounded on $S$, where now $S$ is a metric space.

a) Prove that $C(S)$ is a closed subset of $B(S)$.

b) Prove that the metric subspace $C(S)$ is complete.

4.68 Refer to the proof of the fixed-point theorem (Theorem 4.48) for notation.

a) Prove that $d(p, p_n) \le d(x, f(x))\,\alpha^n/(1 - \alpha)$.

This inequality, which is useful in numerical work, provides an estimate for the distance
from $p_n$ to the fixed point $p$. An example is given in (b).

b) Take $f(x) = \frac{1}{2}(x + 2/x)$, $S = [1, +\infty)$. Prove that $f$ is a contraction of $S$ with
contraction constant $\alpha = \frac{1}{2}$ and fixed point $p = \sqrt{2}$. Form the sequence $\{p_n\}$
starting with $x = p_0 = 1$ and show that $|p_n - \sqrt{2}| \le 2^{-n}$.

4.69 Show by counterexamples that the fixed-point theorem for contractions need not
hold if either (a) the underlying metric space is not complete, or (b) the contraction
constant $\alpha > 1$.

4.70 Let $f : S \to S$ be a function from a complete metric space $(S, d)$ into itself. Assume
there is a real sequence $\{\alpha_n\}$ which converges to 0 such that $d(f^n(x), f^n(y)) \le \alpha_n d(x, y)$
for all $n \ge 1$ and all $x$, $y$ in $S$, where $f^n$ is the $n$th iterate of $f$; that is,

\[ f^1(x) = f(x), \qquad f^{n+1}(x) = f(f^n(x)) \quad \text{for } n \ge 1. \]

Prove that $f$ has a unique fixed point. Hint. Apply the fixed-point theorem to $f^m$ for a
suitable $m$.


4.71 Let $f : S \to S$ be a function from a metric space $(S, d)$ into itself such that

\[ d(f(x), f(y)) < d(x, y) \quad \text{whenever } x \ne y. \]

a) Prove that $f$ has at most one fixed point, and give an example of such an $f$ with no
fixed point.

b) If $S$ is compact, prove that $f$ has exactly one fixed point. Hint. Show that
$g(x) = d(x, f(x))$ attains its minimum on $S$.

c) Give an example with $S$ compact in which $f$ is not a contraction.

4.72 Assume that $f$ satisfies the condition in Exercise 4.71. If $x \in S$, let $p_0 = x$,
$p_{n+1} = f(p_n)$, and $c_n = d(p_n, p_{n+1})$ for $n \ge 0$.

a) Prove that $\{c_n\}$ is a decreasing sequence, and let $c = \lim c_n$.

b) Assume there is a subsequence $\{p_{k(n)}\}$ which converges to a point $q$ in $S$. Prove
that

\[ c = d(q, f(q)) = d(f(q), f[f(q)]). \]

Deduce that $q$ is a fixed point of $f$ and that $p_n \to q$.

5.1 INTRODUCTION

This chapter treats the derivative, the central concept of differential calculus. Two
different types of problem, the physical problem of finding the instantaneous
velocity of a moving particle and the geometrical problem of finding the tangent
line to a curve at a given point, both lead quite naturally to the notion of derivative.
Here, we shall not be concerned with applications to mechanics and geometry,
but instead will confine our study to general properties of derivatives.

This chapter deals primarily with derivatives of functions of one real variable,
specifically, real-valued functions defined on intervals in $\mathbf{R}$. It also discusses
briefly derivatives of vector-valued functions of one real variable, and partial
derivatives, since these topics involve no new ideas. Much of this material should
be familiar to the reader from elementary calculus. A more detailed treatment of
derivative theory for functions of several variables involves significant changes
and is dealt with in Chapter 12.

The last part of the chapter discusses derivatives of complex-valued functions
of a complex variable.

5.2 DEFINITION OF DERIVATIVE 

If $f$ is defined on an open interval $(a, b)$, then for two distinct points $x$ and $c$ in
$(a, b)$ we can form the difference quotient

\[ \frac{f(x) - f(c)}{x - c}. \]

We keep $c$ fixed and study the behavior of this quotient as $x \to c$.

Definition 5.1. Let $f$ be defined on an open interval $(a, b)$, and assume that $c \in (a, b)$.
Then $f$ is said to be differentiable at $c$ whenever the limit

\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]

exists. The limit, denoted by $f'(c)$, is called the derivative of $f$ at $c$.

This limit process defines a new function $f'$, whose domain consists of those
points in $(a, b)$ at which $f$ is differentiable. The function $f'$ is called the first
derivative of $f$. Similarly, the $n$th derivative of $f$, denoted by $f^{(n)}$, is defined to be
the first derivative of $f^{(n-1)}$, for $n = 2, 3, \ldots$. (By our definition, we do not
consider $f^{(n)}$ unless $f^{(n-1)}$ is defined on an open interval.) Other notations with
which the reader may be familiar are

\[ f'(c) = Df(c) = \frac{df}{dx}(c) = \left. \frac{dy}{dx} \right|_{x = c} \quad [\text{where } y = f(x)], \]

or similar notations. The function $f$ itself is sometimes written $f^{(0)}$. The process
which produces $f'$ from $f$ is called differentiation.


5.3 DERIVATIVES AND CONTINUITY 

The next theorem makes it possible to reduce some of the theorems on derivatives 
to theorems on continuity. 

Theorem 5.2. If f is defined on (a, b) and differentiable at a point c in (a, b), then 
there is a function f* (depending on f and on c ) which is continuous at c and which 
satisfies the equation 

$$f(x) - f(c) = (x - c) f^*(x), \tag{1}$$

for all $x$ in $(a, b)$, with $f^*(c) = f'(c)$. Conversely, if there is a function $f^*$,
continuous at $c$, which satisfies (1), then $f$ is differentiable at $c$ and $f'(c) = f^*(c)$.

Proof. If $f'(c)$ exists, let $f^*$ be defined on $(a, b)$ as follows:

$$f^*(x) = \frac{f(x) - f(c)}{x - c} \quad \text{if } x \ne c, \qquad f^*(c) = f'(c).$$

Then $f^*$ is continuous at $c$ and (1) holds for all $x$ in $(a, b)$.

Conversely, if (1) holds for some $f^*$ continuous at $c$, then by dividing by $x - c$
and letting $x \to c$ we see that $f'(c)$ exists and equals $f^*(c)$.

As an immediate consequence of (1) we obtain:

Theorem 5.3. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

Proof. Let $x \to c$ in (1).

Note. Equation (1) has a geometric interpretation which helps us gain insight
into its meaning. Since $f^*$ is continuous at $c$, $f^*(x)$ is nearly equal to $f^*(c) = f'(c)$
if $x$ is near $c$. Replacing $f^*(x)$ by $f'(c)$ in (1) we obtain the equation

$$f(x) = f(c) + f'(c)(x - c),$$

which should be approximately correct when $x - c$ is small. In other words, if $f$ is
differentiable at $c$, then $f$ is approximately a linear function near $c$. (See Fig. 5.1.)
Differential calculus continually exploits this geometric property of functions.

Figure 5.1
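The role of the function $f^*$ in Theorem 5.2 can be illustrated numerically. The sketch below is only an illustration under assumptions not in the text: the sample function $f(x) = x^3$, the point $c = 2$, and the names `f_star`, `gaps`, and `linear_error` are all chosen here for the example.

```python
def f(x):
    return x ** 3

def f_star(x, c, fprime_c):
    """Difference quotient of f at c, extended by the value f'(c) at x = c."""
    if x == c:
        return fprime_c
    return (f(x) - f(c)) / (x - c)

c = 2.0
fprime_c = 12.0  # derivative of x**3 at c = 2 (assumed known for this example)

# f*(x) approaches f'(c) as x -> c, which is the continuity of f* at c.
gaps = [abs(f_star(c + h, c, fprime_c) - fprime_c) for h in (1e-1, 1e-2, 1e-3)]

# Error of the linear approximation f(c) + f'(c)(x - c) at x = c + 0.01.
x = c + 0.01
linear_error = abs(f(x) - (f(c) + fprime_c * (x - c)))
```

For this particular $f$, the gap $|f^*(x) - f'(c)|$ shrinks roughly in proportion to $|x - c|$, so the error of the linear approximation shrinks roughly like $(x - c)^2$.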


5.4 ALGEBRA OF DERIVATIVES 

The next theorem describes the usual formulas for differentiating the sum, differ- 
ence, product and quotient of two functions. 

Theorem 5.4. Assume $f$ and $g$ are defined on $(a, b)$ and differentiable at $c$. Then
$f + g$, $f - g$, and $f \cdot g$ are also differentiable at $c$. This is also true of $f/g$ if $g(c) \ne 0$.
The derivatives at $c$ are given by the following formulas:

a) $(f \pm g)'(c) = f'(c) \pm g'(c)$,

b) $(f \cdot g)'(c) = f(c)g'(c) + f'(c)g(c)$,

c) $(f/g)'(c) = \dfrac{g(c)f'(c) - f(c)g'(c)}{g(c)^2}$, provided $g(c) \ne 0$.

Proof. We shall prove (b). Using Theorem 5.2 we write

$$f(x) = f(c) + (x - c)f^*(x), \qquad g(x) = g(c) + (x - c)g^*(x).$$

Thus,

$$f(x)g(x) - f(c)g(c) = (x - c)[f(c)g^*(x) + f^*(x)g(c)] + (x - c)^2 f^*(x)g^*(x).$$

Dividing by $x - c$ and letting $x \to c$ we obtain (b). Proofs of the other statements
are similar.

From the definition we see at once that if $f$ is constant on $(a, b)$ then $f' = 0$
on $(a, b)$. Also, if $f(x) = x$, then $f'(x) = 1$ for all $x$. Repeated application of
Theorem 5.4 tells us that if $f(x) = x^n$ ($n$ a positive integer), then $f'(x) = nx^{n-1}$
for all $x$. Applying Theorem 5.4 again, we see that every polynomial has a derivative
everywhere in $\mathbf{R}$ and every rational function has a derivative wherever it is
defined.
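The product and quotient formulas of Theorem 5.4 can be checked numerically at a single point. This is a rough sketch, not a proof: the functions $\sin$ and $\exp$, the point $c = 0.7$, and the helper `derivative` (a central difference quotient standing in for the limit) are assumptions made for the illustration, and the derivatives of $\sin$ and $\exp$ are taken as known.

```python
import math

def derivative(func, c, h=1e-6):
    """Central difference quotient, a numerical stand-in for the limit in Definition 5.1."""
    return (func(c + h) - func(c - h)) / (2 * h)

f = math.sin
g = math.exp
c = 0.7

fp, gp = math.cos(c), math.exp(c)  # known derivatives of sin and exp at c

# Right-hand sides of formulas (b) and (c) in Theorem 5.4.
product_rule = f(c) * gp + fp * g(c)
quotient_rule = (g(c) * fp - f(c) * gp) / g(c) ** 2

# Direct numerical differentiation of the product and the quotient.
product_numeric = derivative(lambda x: f(x) * g(x), c)
quotient_numeric = derivative(lambda x: f(x) / g(x), c)
```

The two pairs of values should agree up to the discretization and rounding error of the difference quotient.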

5.5 THE CHAIN RULE 

A much deeper result is the so-called chain rule for differentiating composite func- 
tions. 
Theorem 5.5 (Chain rule). Let $f$ be defined on an open interval $S$, let $g$ be defined on
$f(S)$, and consider the composite function $g \circ f$ defined on $S$ by the equation

$$(g \circ f)(x) = g[f(x)].$$

Assume there is a point $c$ in $S$ such that $f(c)$ is an interior point of $f(S)$. If $f$ is
differentiable at $c$ and if $g$ is differentiable at $f(c)$ then $g \circ f$ is differentiable at $c$
and we have

$$(g \circ f)'(c) = g'[f(c)]\,f'(c).$$

Proof. Using Theorem 5.2 we can write

$$f(x) - f(c) = (x - c)f^*(x) \quad \text{for all } x \text{ in } S,$$

where $f^*$ is continuous at $c$ and $f^*(c) = f'(c)$. Similarly,

$$g(y) - g[f(c)] = [y - f(c)]\,g^*(y),$$

for all $y$ in some open subinterval $T$ of $f(S)$ containing $f(c)$. Here $g^*$ is continuous
at $f(c)$ and $g^*[f(c)] = g'[f(c)]$.

Choosing $x$ in $S$ so that $y = f(x) \in T$, we then have

$$g[f(x)] - g[f(c)] = [f(x) - f(c)]\,g^*[f(x)] = (x - c)f^*(x)\,g^*[f(x)]. \tag{2}$$

By the continuity theorem for composite functions,

$$g^*[f(x)] \to g^*[f(c)] = g'[f(c)] \quad \text{as } x \to c.$$

Therefore, if we divide by $x - c$ in (2) and let $x \to c$, we obtain

$$\lim_{x \to c} \frac{g[f(x)] - g[f(c)]}{x - c} = g'[f(c)]\,f'(c),$$

as required.
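As a small numerical illustration of the chain rule, the sketch below compares $g'[f(c)]f'(c)$ with a difference quotient of the composite function. The choices $f(x) = x^2 + 1$, $g(y) = y^3$, $c = 1$, and the helper `derivative` are assumptions made only for this example.

```python
def f(x):
    return x * x + 1.0

def g(y):
    return y ** 3

def derivative(func, c, h=1e-6):
    """Central difference quotient approximating the derivative at c."""
    return (func(c + h) - func(c - h)) / (2 * h)

c = 1.0

# Chain rule: g'(y) = 3 y**2 evaluated at y = f(c), times f'(c) = 2 c.
chain_rule_value = 3 * f(c) ** 2 * (2 * c)

# Direct numerical derivative of the composite function g o f at c.
numeric_value = derivative(lambda x: g(f(x)), c)
```

Here $f(1) = 2$, so the chain rule gives $3 \cdot 2^2 \cdot 2 = 24$, and the difference quotient should be close to that value.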


5.6 ONE-SIDED DERIVATIVES AND INFINITE DERIVATIVES 

Up to this point, the statement that $f$ has a derivative at $c$ has meant that $c$ was
interior to an interval in which $f$ was defined and that the limit defining $f'(c)$ was
finite. It is convenient to extend the scope of our ideas somewhat in order to discuss
derivatives at endpoints of intervals. It is also desirable to introduce infinite
derivatives, so that the usual geometric interpretation of a derivative as the slope
of a tangent line will still be valid in case the tangent line happens to be vertical.
In such a case we cannot prove that $f$ is continuous at $c$. Therefore, we explicitly
require it to be so.

Definition 5.6. Let $f$ be defined on a closed interval $S$ and assume that $f$ is continuous
at the point $c$ in $S$. Then $f$ is said to have a righthand derivative at $c$ if the righthand
limit

$$\lim_{x \to c+} \frac{f(x) - f(c)}{x - c}$$

exists as a finite value, or if the limit is $+\infty$ or $-\infty$. This limit will be denoted by
$f'_+(c)$. Lefthand derivatives, denoted by $f'_-(c)$, are similarly defined. In addition,
if $c$ is an interior point of $S$, then we say that $f$ has the derivative $f'(c) = +\infty$ if
both the right- and lefthand derivatives at $c$ are $+\infty$. (The derivative $f'(c) = -\infty$
is similarly defined.)

It is clear that $f$ has a derivative (finite or infinite) at an interior point $c$ if, and
only if, $f'_+(c) = f'_-(c)$, in which case $f'_+(c) = f'_-(c) = f'(c)$.



Figure 5.2 illustrates some of these concepts. At the point $x_1$ we have $f'_+(x_1) = -\infty$.
At $x_2$ the lefthand derivative is 0 and the righthand derivative is $-1$. Also,
$f'(x_3) = -\infty$, $f'_-(x_4) = -1$, $f'_+(x_4) = +1$, $f'(x_6) = +\infty$, and $f'_-(x_7) = \ldots$
There is no derivative (one-sided or otherwise) at $x_5$, since $f$ is not continuous
there.

5.7 FUNCTIONS WITH NONZERO DERIVATIVE 

Theorem 5.7. Let $f$ be defined on an open interval $(a, b)$ and assume that for some
$c$ in $(a, b)$ we have $f'(c) > 0$ or $f'(c) = +\infty$. Then there is a 1-ball $B(c) \subseteq (a, b)$
in which

$$f(x) > f(c) \text{ if } x > c, \quad \text{and} \quad f(x) < f(c) \text{ if } x < c.$$

Proof. If $f'(c)$ is finite and positive we can write

$$f(x) - f(c) = (x - c)f^*(x),$$

where $f^*$ is continuous at $c$ and $f^*(c) = f'(c) > 0$. By the sign-preserving property
of continuous functions there is a 1-ball $B(c) \subseteq (a, b)$ in which $f^*(x)$ has the same
sign as $f^*(c)$, and this means that $f(x) - f(c)$ has the same sign as $x - c$.

If $f'(c) = +\infty$, there is a 1-ball $B(c)$ in which

$$\frac{f(x) - f(c)}{x - c} > 1 \quad \text{whenever } x \ne c.$$

In this ball the quotient is again positive and the conclusion follows as before.

A result similar to Theorem 5.7 holds, of course, if $f'(c) < 0$ or if $f'(c) = -\infty$
at some interior point $c$ of $(a, b)$.


5.8 ZERO DERIVATIVES AND LOCAL EXTREMA 

Definition 5.8. Let $f$ be a real-valued function defined on a subset $S$ of a metric
space $M$, and assume $a \in S$. Then $f$ is said to have a local maximum at $a$ if there is
a ball $B(a)$ such that

$$f(x) \le f(a) \quad \text{for all } x \text{ in } B(a) \cap S.$$

If $f(x) \ge f(a)$ for all $x$ in $B(a) \cap S$, then $f$ is said to have a local minimum at $a$.

Note. A local maximum at $a$ is the absolute maximum of $f$ on the subset $B(a) \cap S$.
If $f$ has an absolute maximum at $a$, then $a$ is also a local maximum. However, $f$
can have local maxima at several points in $S$ without having an absolute maximum
on the whole set $S$.

The next theorem shows a connection between zero derivatives and local 
extrema (maxima or minima) at interior points. 

Theorem 5.9. Let $f$ be defined on an open interval $(a, b)$ and assume that $f$ has a
local maximum or a local minimum at an interior point $c$ of $(a, b)$. If $f$ has a derivative
(finite or infinite) at $c$, then $f'(c)$ must be 0.

Proof. If $f'(c)$ is positive or $+\infty$, then $f$ cannot have a local extremum at $c$
because of Theorem 5.7. Similarly, $f'(c)$ cannot be negative or $-\infty$. However,
because there is a derivative at $c$, the only other possibility is $f'(c) = 0$.

The converse of Theorem 5.9 is not true. In general, knowing that $f'(c) = 0$
is not enough to determine whether $f$ has an extremum at $c$. In fact, it may have
neither, as can be verified by the example $f(x) = x^3$ and $c = 0$. In this case,
$f'(0) = 0$ but $f$ is increasing in every neighborhood of 0.

Furthermore, it should be emphasized that $f$ can have a local extremum at $c$
without $f'(c)$ being zero. The example $f(x) = |x|$ has a minimum at $x = 0$ but,
of course, there is no derivative at 0. Theorem 5.9 assumes that $f$ has a derivative
(finite or infinite) at $c$. The theorem also assumes that $c$ is an interior point of
$(a, b)$. In the example $f(x) = x$, where $a \le x \le b$, $f$ takes on its maximum and
minimum at the endpoints but $f'(x)$ is never zero in $[a, b]$.
5.9 ROLLE'S THEOREM

It is geometrically evident that a sufficiently "smooth" curve which crosses the
$x$-axis at both endpoints of an interval $[a, b]$ must have a "turning point" somewhere
between $a$ and $b$. The precise statement of this fact is known as Rolle's
theorem.

Theorem 5.10 (Rolle). Assume $f$ has a derivative (finite or infinite) at each point of
an open interval $(a, b)$, and assume that $f$ is continuous at both endpoints $a$ and $b$.
If $f(a) = f(b)$ there is at least one interior point $c$ at which $f'(c) = 0$.

Proof. We assume $f'$ is never 0 in $(a, b)$ and obtain a contradiction. Since $f$ is
continuous on a compact set, it attains its maximum $M$ and its minimum $m$ somewhere
in $[a, b]$. Neither extreme value is attained at an interior point (otherwise
$f'$ would vanish there) so both are attained at the endpoints. Since $f(a) = f(b)$,
then $m = M$, and hence $f$ is constant on $[a, b]$. This contradicts the assumption
that $f'$ is never 0 on $(a, b)$. Therefore $f'(c) = 0$ for some $c$ in $(a, b)$.


5.10 THE MEAN-VALUE THEOREM FOR DERIVATIVES 

Theorem 5.11 (Mean-Value Theorem). Assume that $f$ has a derivative (finite or
infinite) at each point of an open interval $(a, b)$, and assume also that $f$ is continuous
at both endpoints $a$ and $b$. Then there is a point $c$ in $(a, b)$ such that

$$f(b) - f(a) = f'(c)(b - a).$$

Geometrically, this states that a sufficiently smooth curve joining two points
$A$ and $B$ has a tangent line with the same slope as the chord $AB$. We will deduce
Theorem 5.11 from a more general version which involves two functions $f$ and $g$ in
a symmetric fashion.

Theorem 5.12 (Generalized Mean-Value Theorem). Let $f$ and $g$ be two functions,
each having a derivative (finite or infinite) at each point of an open interval $(a, b)$
and each continuous at the endpoints $a$ and $b$. Assume also that there is no interior
point $x$ at which both $f'(x)$ and $g'(x)$ are infinite. Then for some interior point $c$ we
have

$$f'(c)[g(b) - g(a)] = g'(c)[f(b) - f(a)].$$

Note. When $g(x) = x$, this gives Theorem 5.11.

Proof. Let $h(x) = f(x)[g(b) - g(a)] - g(x)[f(b) - f(a)]$. Then $h'(x)$ is finite if
both $f'(x)$ and $g'(x)$ are finite, and $h'(x)$ is infinite if exactly one of $f'(x)$ or $g'(x)$ is
infinite. (The hypothesis excludes the case of both $f'(x)$ and $g'(x)$ being infinite.)
Also, $h$ is continuous at the endpoints, and $h(a) = h(b) = f(a)g(b) - g(a)f(b)$.
By Rolle's theorem we have $h'(c) = 0$ for some interior point, and this proves the
assertion.

Note. The reader should interpret Theorem 5.12 geometrically by referring to the
curve in the $xy$-plane described by the parametric equations $x = g(t)$, $y = f(t)$,
$a \le t \le b$.
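The Mean-Value Theorem (Theorem 5.11) only asserts that a suitable point $c$ exists, but for a concrete function one can locate it numerically. The following sketch assumes the example $f(x) = x^3$ on $[0, 2]$, where $f'$ is increasing so that bisection applies; the function and variable names are chosen here for illustration.

```python
import math

def f(x):
    return x ** 3

def fprime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # slope of the chord joining (a, f(a)) and (b, f(b))

# Bisection on f'(x) - slope. For this f, f' is increasing on [a, b] and
# f'(a) < slope < f'(b), so the sign change brackets the point c.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) < slope:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2
```

In this example the chord has slope 4, and solving $3c^2 = 4$ by hand gives $c = 2/\sqrt{3}$, which is what the bisection should approach.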

There is also an extension which does not require continuity at the endpoints. 

Theorem 5.13. Let $f$ and $g$ be two functions, each having a derivative (finite or
infinite) at each point of $(a, b)$. At the endpoints assume that the limits $f(a+)$,
$g(a+)$, $f(b-)$ and $g(b-)$ exist as finite values. Assume further that there is no
interior point $x$ at which both $f'(x)$ and $g'(x)$ are infinite. Then for some interior
point $c$ we have

$$f'(c)[g(b-) - g(a+)] = g'(c)[f(b-) - f(a+)].$$

Proof. Define new functions $F$ and $G$ on $[a, b]$ as follows:

$$F(x) = f(x) \text{ and } G(x) = g(x) \quad \text{if } x \in (a, b);$$

$$F(a) = f(a+), \quad G(a) = g(a+), \quad F(b) = f(b-), \quad G(b) = g(b-).$$

Then $F$ and $G$ are continuous on $[a, b]$ and we can apply Theorem 5.12 to $F$ and
$G$ to obtain the desired conclusion.

The next result is an immediate consequence of the Mean-Value Theorem.

Theorem 5.14. Assume $f$ has a derivative (finite or infinite) at each point of an open
interval $(a, b)$ and that $f$ is continuous at the endpoints $a$ and $b$.

a) If $f'$ takes only positive values (finite or infinite) in $(a, b)$, then $f$ is strictly
increasing on $[a, b]$.

b) If $f'$ takes only negative values (finite or infinite) in $(a, b)$, then $f$ is strictly
decreasing on $[a, b]$.

c) If $f'$ is zero everywhere in $(a, b)$ then $f$ is constant on $[a, b]$.

Proof. Choose $x < y$ and apply the Mean-Value Theorem to the subinterval
$[x, y]$ of $[a, b]$ to obtain

$$f(y) - f(x) = f'(c)(y - x) \quad \text{where } c \in (x, y).$$

All the statements of the theorem follow at once from this equation.

By applying Theorem 5.14(c) to the difference $f - g$ we obtain:

Corollary 5.15. If $f$ and $g$ are continuous on $[a, b]$ and have equal finite derivatives
in $(a, b)$, then $f - g$ is constant on $[a, b]$.


5.11 INTERMEDIATE-VALUE THEOREM FOR DERIVATIVES 

In Theorem 4.33 we proved that a function $f$ which is continuous on a compact
interval $[a, b]$ assumes every value between its maximum and its minimum on
the interval. In particular, $f$ assumes every value between $f(a)$ and $f(b)$. A similar
result will now be proved for functions which are derivatives.

Theorem 5.16 (Intermediate-value theorem for derivatives). Assume that $f$ is defined
on a compact interval $[a, b]$ and that $f$ has a derivative (finite or infinite) at each
interior point. Assume also that $f$ has finite one-sided derivatives $f'_+(a)$ and $f'_-(b)$ at
the endpoints, with $f'_+(a) \ne f'_-(b)$. Then, if $c$ is a real number between $f'_+(a)$ and
$f'_-(b)$, there exists at least one interior point $x$ such that $f'(x) = c$.

Proof. Define a new function $g$ as follows:

$$g(x) = \frac{f(x) - f(a)}{x - a} \quad \text{if } x \ne a, \qquad g(a) = f'_+(a).$$

Then $g$ is continuous on the closed interval $[a, b]$. By the intermediate-value
theorem for continuous functions, $g$ takes on every value between $f'_+(a)$ and
$[f(b) - f(a)]/(b - a)$ in the interior $(a, b)$. By the Mean-Value Theorem, we have
$g(x) = f'(k)$ for some $k$ in $(a, x)$ whenever $x \in (a, b)$. Therefore $f'$ takes on every
value between $f'_+(a)$ and $[f(b) - f(a)]/(b - a)$ in the interior $(a, b)$. A similar
argument applied to the function $h$, defined by

$$h(x) = \frac{f(x) - f(b)}{x - b} \quad \text{if } x \ne b, \qquad h(b) = f'_-(b),$$

shows that $f'$ takes on every value between $[f(b) - f(a)]/(b - a)$ and $f'_-(b)$ in the
interior $(a, b)$. Combining these results, we see that $f'$ takes on every value between
$f'_+(a)$ and $f'_-(b)$ in the interior $(a, b)$, and this proves the theorem.

Note. Theorem 5.16 is still valid if one or both of the one-sided derivatives
$f'_+(a)$, $f'_-(b)$, is infinite. The proof in this case can be given by considering the
auxiliary function $g$ defined by the equation $g(x) = f(x) - cx$, if $x \in [a, b]$.
Details are left to the reader.

The intermediate-value theorem shows that a derivative cannot change sign 
in an interval without taking the value 0. Therefore, we have the following 
strengthening of Theorem 5.14(a) and (b). 

Theorem 5.17. Assume $f$ has a derivative (finite or infinite) on $(a, b)$ and is continuous
at the endpoints $a$ and $b$. If $f'(x) \ne 0$ for all $x$ in $(a, b)$ then $f$ is strictly
monotonic on $[a, b]$.

The intermediate-value theorem also shows that monotonic derivatives are 
necessarily continuous. 

Theorem 5.18. Assume $f'$ exists and is monotonic on an open interval $(a, b)$. Then
$f'$ is continuous on $(a, b)$.

Proof. We assume $f'$ has a discontinuity at some point $c$ in $(a, b)$ and arrive at a
contradiction. Choose a closed subinterval $[\alpha, \beta]$ of $(a, b)$ which contains $c$ in its
interior. Since $f'$ is monotonic on $[\alpha, \beta]$ the discontinuity at $c$ must be a jump
discontinuity (by Theorem 4.51). Hence $f'$ omits some value between $f'(\alpha)$ and
$f'(\beta)$, contradicting the intermediate-value theorem.

5.12 TAYLOR’S FORMULA WITH REMAINDER 

As noted earlier, if $f$ is differentiable at $c$, then $f$ is approximately a linear function
near $c$. That is, the equation

$$f(x) = f(c) + f'(c)(x - c),$$

is approximately correct when $x - c$ is small. Taylor's theorem tells us that, more
generally, $f$ can be approximated by a polynomial of degree $n - 1$ if $f$ has a derivative
of order $n$. Moreover, Taylor's theorem gives a useful expression for the error
made by this approximation.

Theorem 5.19 (Taylor). Let $f$ be a function having finite $n$th derivative $f^{(n)}$ everywhere
in an open interval $(a, b)$ and assume that $f^{(n-1)}$ is continuous on the closed
interval $[a, b]$. Assume that $c \in [a, b]$. Then, for every $x$ in $[a, b]$, $x \ne c$, there
exists a point $x_1$ interior to the interval joining $x$ and $c$ such that

$$f(x) = f(c) + \sum_{k=1}^{n-1} \frac{f^{(k)}(c)}{k!}(x - c)^k + \frac{f^{(n)}(x_1)}{n!}(x - c)^n.$$

Taylor’s theorem will be obtained as a consequence of a more general result 
that is a direct extension of the generalized Mean-Value Theorem. 

Theorem 5.20. Let $f$ and $g$ be two functions having finite $n$th derivatives $f^{(n)}$ and
$g^{(n)}$ in an open interval $(a, b)$ and continuous $(n - 1)$st derivatives in the closed
interval $[a, b]$. Assume that $c \in [a, b]$. Then, for every $x$ in $[a, b]$, $x \ne c$, there
exists a point $x_1$ interior to the interval joining $x$ and $c$ such that

$$\left[f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!}(x - c)^k\right] g^{(n)}(x_1) = f^{(n)}(x_1)\left[g(x) - \sum_{k=0}^{n-1} \frac{g^{(k)}(c)}{k!}(x - c)^k\right].$$

Note. For the special case in which $g(x) = (x - c)^n$, we have $g^{(k)}(c) = 0$ for
$0 \le k \le n - 1$ and $g^{(n)}(x) = n!$. This theorem then reduces to Taylor's theorem.

Proof. For simplicity, assume that $c < b$ and that $x > c$. Keep $x$ fixed and define
new functions $F$ and $G$ as follows:

$$F(t) = f(t) + \sum_{k=1}^{n-1} \frac{f^{(k)}(t)}{k!}(x - t)^k,$$

$$G(t) = g(t) + \sum_{k=1}^{n-1} \frac{g^{(k)}(t)}{k!}(x - t)^k,$$

for each $t$ in $[c, x]$. Then $F$ and $G$ are continuous on the closed interval $[c, x]$
and have finite derivatives in the open interval $(c, x)$. Therefore, Theorem 5.12 is
applicable and we can write

$$F'(x_1)[G(x) - G(c)] = G'(x_1)[F(x) - F(c)], \quad \text{where } x_1 \in (c, x).$$

This reduces to the equation

$$F'(x_1)[g(x) - G(c)] = G'(x_1)[f(x) - F(c)], \tag{a}$$

since $G(x) = g(x)$ and $F(x) = f(x)$. If, now, we compute the derivative of the sum
defining $F(t)$, keeping in mind that each term of the sum is a product, we find that
all terms cancel but one, and we are left with

$$F'(t) = \frac{(x - t)^{n-1}}{(n - 1)!} f^{(n)}(t).$$

Similarly, we obtain

$$G'(t) = \frac{(x - t)^{n-1}}{(n - 1)!} g^{(n)}(t).$$

If we put $t = x_1$ and substitute into (a), we obtain the formula of the theorem.
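For a function whose derivatives are known in closed form, the intermediate point $x_1$ in Taylor's formula (Theorem 5.19) can be recovered explicitly. The sketch below is an illustration under assumptions made here: it takes $f = \exp$, $c = 0$, $x = 1$, and $n = 5$, and solves the remainder formula for $x_1$.

```python
import math

n = 5
c, x = 0.0, 1.0

# Taylor polynomial of degree n - 1 for exp about c (every derivative of exp is exp).
poly = sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n))
remainder = math.exp(x) - poly

# Theorem 5.19 asserts remainder = exp(x1) * (x - c)**n / n! for some x1 in (c, x).
# For exp this equation can be solved for x1 with a logarithm.
x1 = math.log(remainder * math.factorial(n) / (x - c) ** n)
```

The computed `x1` should fall strictly between $c$ and $x$, as the theorem requires. The theorem itself does not say where in the interval the point lies.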


5.13 DERIVATIVES OF VECTOR-VALUED FUNCTIONS 

Let $\mathbf{f} : (a, b) \to \mathbf{R}^n$ be a vector-valued function defined on an open interval $(a, b)$
in $\mathbf{R}$. Then $\mathbf{f} = (f_1, \ldots, f_n)$ where each component $f_k$ is a real-valued function
defined on $(a, b)$. We say that $\mathbf{f}$ is differentiable at a point $c$ in $(a, b)$ if each
component $f_k$ is differentiable at $c$ and we define

$$\mathbf{f}'(c) = (f_1'(c), \ldots, f_n'(c)).$$

In other words, the derivative $\mathbf{f}'(c)$ is obtained by differentiating each component
of $\mathbf{f}$ at $c$. In view of this definition, it is not surprising to find that many of the
theorems on differentiation are also valid for vector-valued functions. For example,
if $\mathbf{f}$ and $\mathbf{g}$ are vector-valued functions differentiable at $c$ and if $\lambda$ is a real-valued
function differentiable at $c$, then the sum $\mathbf{f} + \mathbf{g}$, the product $\lambda\mathbf{f}$, and the dot product
$\mathbf{f} \cdot \mathbf{g}$ are differentiable at $c$ and we have

$$(\mathbf{f} + \mathbf{g})'(c) = \mathbf{f}'(c) + \mathbf{g}'(c),$$

$$(\lambda\mathbf{f})'(c) = \lambda'(c)\mathbf{f}(c) + \lambda(c)\mathbf{f}'(c),$$

$$(\mathbf{f} \cdot \mathbf{g})'(c) = \mathbf{f}'(c) \cdot \mathbf{g}(c) + \mathbf{f}(c) \cdot \mathbf{g}'(c).$$

The proofs follow easily by considering components. There is also a chain rule for
differentiating composite functions which is proved in the same way. If $\mathbf{f}$ is
vector-valued and if $u$ is real-valued, then the composite function $\mathbf{g}$ given by
$\mathbf{g}(x) = \mathbf{f}[u(x)]$ is vector-valued. The chain rule states that

$$\mathbf{g}'(c) = \mathbf{f}'[u(c)]\,u'(c),$$

if the domain of $\mathbf{f}$ contains a neighborhood of $u(c)$ and if $u'(c)$ and $\mathbf{f}'[u(c)]$ both
exist.
The Mean-Value Theorem, as stated in Theorem 5.11, does not hold for
vector-valued functions. For example, if $\mathbf{f}(t) = (\cos t, \sin t)$ for all real $t$, then

$$\mathbf{f}(2\pi) - \mathbf{f}(0) = \mathbf{0},$$

but $\mathbf{f}'(t)$ is never zero. In fact, $\|\mathbf{f}'(t)\| = 1$ for all $t$. A modified version of the
Mean-Value Theorem for vector-valued functions is given in Chapter 12 (Theorem
12.8).
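The failure of the Mean-Value Theorem for the circle example can be observed numerically. This is a minimal sketch; the sampling grid of 999 points and the helper names `fprime`, `norm`, and `min_speed` are choices made here for illustration.

```python
import math

def f(t):
    return (math.cos(t), math.sin(t))

def fprime(t):
    # Componentwise derivative of (cos t, sin t).
    return (-math.sin(t), math.cos(t))

def norm(v):
    return math.hypot(v[0], v[1])

a, b = 0.0, 2 * math.pi
difference = tuple(fb - fa for fa, fb in zip(f(a), f(b)))

# Sample ||f'(t)|| at interior points of (a, b); the norm stays at 1 throughout.
samples = [a + (b - a) * k / 1000 for k in range(1, 1000)]
min_speed = min(norm(fprime(t)) for t in samples)
```

The difference $\mathbf{f}(2\pi) - \mathbf{f}(0)$ vanishes up to rounding, while the sampled norms of $\mathbf{f}'(t)$ stay at 1, so no point $c$ can satisfy $\mathbf{f}(2\pi) - \mathbf{f}(0) = 2\pi\,\mathbf{f}'(c)$.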

5.14 PARTIAL DERIVATIVES 

Let $S$ be an open set in Euclidean space $\mathbf{R}^n$, and let $f : S \to \mathbf{R}$ be a real-valued
function defined on $S$. If $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{c} = (c_1, \ldots, c_n)$ are two points
of $S$ having corresponding coordinates equal except for the $k$th, that is, if $x_i = c_i$
for $i \ne k$ and if $x_k \ne c_k$, then we can consider the limit

$$\lim_{x_k \to c_k} \frac{f(\mathbf{x}) - f(\mathbf{c})}{x_k - c_k}.$$

When this limit exists, it is called the partial derivative of $f$ with respect to the $k$th
coordinate and is denoted by

$$D_k f(\mathbf{c}), \qquad f_k(\mathbf{c}), \qquad \frac{\partial f}{\partial x_k}(\mathbf{c}),$$

or by a similar expression. We shall adhere to the notation $D_k f(\mathbf{c})$.

This process produces $n$ further functions $D_1 f, D_2 f, \ldots, D_n f$ defined at those
points in $S$ where the corresponding limits exist.

Partial differentiation is not really a new concept. We are merely treating
$f(x_1, \ldots, x_n)$ as a function of one variable at a time, holding the others fixed.
That is, if we introduce a function $g$ defined by

$$g(x_k) = f(c_1, \ldots, c_{k-1}, x_k, c_{k+1}, \ldots, c_n),$$

then the partial derivative $D_k f(\mathbf{c})$ is exactly the same as the ordinary derivative
$g'(c_k)$. This is usually described by saying that we differentiate $f$ with respect to
the $k$th variable, holding the others fixed.

In generalizing a concept from $\mathbf{R}^1$ to $\mathbf{R}^n$, we seek to preserve the important
properties in the one-dimensional case. For example, in the one-dimensional case,
the existence of the derivative at $c$ implies continuity at $c$. Therefore it seems
desirable to have a concept of derivative for functions of several variables which
will imply continuity. Partial derivatives do not do this. A function of $n$ variables
can have partial derivatives at a point with respect to each of the variables and yet
not be continuous at the point. We illustrate with the following example of a
function of two variables:

$$f(x, y) = \begin{cases} x + y, & \text{if } x = 0 \text{ or } y = 0, \\ 1, & \text{otherwise.} \end{cases}$$
The partial derivatives $D_1 f(0, 0)$ and $D_2 f(0, 0)$ both exist. In fact,

$$D_1 f(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x - 0} = \lim_{x \to 0} \frac{x}{x} = 1,$$

and, similarly, $D_2 f(0, 0) = 1$. On the other hand, it is clear that this function is
not continuous at $(0, 0)$.
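The behavior described here can be reproduced with a concrete function. The sketch below assumes, for illustration, a function equal to $x + y$ on the coordinate axes and equal to 1 everywhere else; this form is consistent with the computation $f(x, 0) = x$ above, but the value off the axes is an assumption of the example.

```python
def f(x, y):
    # Assumed example: x + y on the coordinate axes, 1 at every other point.
    if x == 0 or y == 0:
        return x + y
    return 1.0

# Difference quotients along each coordinate axis at (0, 0).
hs = [10.0 ** (-k) for k in range(1, 8)]
d1_quotients = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in hs]
d2_quotients = [(f(0.0, h) - f(0.0, 0.0)) / h for h in hs]

# Values along the diagonal y = x stay at 1 although f(0, 0) = 0.
diagonal_values = [f(h, h) for h in hs]
```

Both lists of quotients are constantly 1, so both partial derivatives exist at the origin, while the values along the diagonal do not approach $f(0, 0) = 0$.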

The existence of the partial derivatives with respect to each variable separately 
implies continuity in each variable separately ; but, as we have just seen, this does 
not necessarily imply continuity in all the variables simultaneously. The difficulty
with partial derivatives is that by their very definition we are forced to consider 
only one variable at a time. Partial derivatives give us the rate of change of a 
function in the direction of each coordinate axis. There is a more general concept of 
derivative which does not restrict our considerations to the special directions of 
the coordinate axes. This will be studied in detail in Chapter 12. 

The purpose of this section is merely to introduce the notation for partial 
derivatives, since we shall use them occasionally before we reach Chapter 12. 

If $f$ has partial derivatives $D_1 f, \ldots, D_n f$ on an open set $S$, then we can also
consider their partial derivatives. These are called second-order partial derivatives.
We write $D_{r,k} f$ for the partial derivative of $D_k f$ with respect to the $r$th variable.
Thus,

$$D_{r,k} f = D_r(D_k f).$$

Higher-order partial derivatives are similarly defined. Other notations are

$$D_{r,k} f = \frac{\partial^2 f}{\partial x_r\, \partial x_k}, \qquad D_{p,q,r} f = \frac{\partial^3 f}{\partial x_p\, \partial x_q\, \partial x_r}.$$

5.15 DIFFERENTIATION OF FUNCTIONS OF A COMPLEX VARIABLE 

In this section we shall discuss briefly derivatives of complex-valued functions
defined on subsets of the complex plane. Such functions are, of course, vector-valued
functions whose domain and range are subsets of $\mathbf{R}^2$. All the considerations
of Chapter 4 concerning limits and continuity of vector-valued functions apply,
in particular, to functions of a complex variable. There is, however, one essential
difference between the set of complex numbers $\mathbf{C}$ and the set of $n$-dimensional
vectors $\mathbf{R}^n$ (when $n > 2$) that plays an important role here. In the complex number
system we have the four algebraic operations of addition, subtraction, multiplication,
and division, and these operations satisfy most of the "usual" laws of algebra
that hold for the real number system. In particular, they satisfy the first five
axioms for real numbers listed in Chapter 1. (Axioms 6 through 10 involve the
ordering relation $<$, which cannot exist among the complex numbers.) Any
algebraic system which satisfies Axioms 1 through 5 is called a field. (For a
thorough discussion of fields, see Reference 1.4.) Multiplication and division, it
turns out, cannot be introduced in $\mathbf{R}^n$ (for $n > 2$) in such a way that $\mathbf{R}^n$ will
become a field† which includes $\mathbf{C}$. Since division is possible in $\mathbf{C}$, however, we can
form the fundamental difference quotient $[f(z) - f(c)]/(z - c)$ which was used
to define the derivative in $\mathbf{R}$, and it now becomes clear how the derivative should be
defined in $\mathbf{C}$.

Definition 5.21. Let $f$ be a complex-valued function defined on an open set $S$ in $\mathbf{C}$,
and assume $c \in S$. Then $f$ is said to be differentiable at $c$ if the limit

$$\lim_{z \to c} \frac{f(z) - f(c)}{z - c} = f'(c)$$

exists.

By means of this limit process, a new complex-valued function $f'$ is defined at
those points $z$ of $S$ where $f'(z)$ exists. Higher-order derivatives $f'', f''', \ldots$ are,
of course, similarly defined.

The following statements can now be proved for complex-valued functions
defined on an open set $S$ by exactly the same proofs used in the real case:

a) $f$ is differentiable at $c$ if, and only if, there is a function $f^*$, continuous at $c$, such
that

$$f(z) - f(c) = (z - c)f^*(z),$$

for all $z$ in $S$, with $f^*(c) = f'(c)$.

Note. If we let $g(z) = f^*(z) - f'(c)$ the equation in (a) can be put in the form

$$f(z) = f(c) + f'(c)(z - c) + g(z)(z - c),$$

where $g(z) \to 0$ as $z \to c$. This is called a first-order Taylor formula for $f$.

b) If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

c) If two functions $f$ and $g$ have derivatives at $c$, then their sum, difference, product,
and quotient also have derivatives at $c$ and are given by the usual formulas (as in
Theorem 5.4). In the case of $f/g$, we must assume $g(c) \ne 0$.

d) The chain rule is valid; that is to say, we have

$$(g \circ f)'(c) = g'[f(c)]\,f'(c),$$

if the domain of $g$ contains a neighborhood of $f(c)$ and if $f'(c)$ and $g'[f(c)]$ both
exist.

When $f(z) = z$, we find $f'(z) = 1$ for all $z$ in $\mathbf{C}$. Using (c) repeatedly, we find
that $f'(z) = nz^{n-1}$ when $f(z) = z^n$ ($n$ is a positive integer). This also holds when
$n$ is a negative integer, provided $z \ne 0$. Therefore, we may compute derivatives
of complex polynomials and complex rational functions by the same techniques
used in elementary differential calculus.

† For example, if it were possible to define multiplication in $\mathbf{R}^3$ so as to make $\mathbf{R}^3$ a field
including $\mathbf{C}$, we could argue as follows: For every $\mathbf{x}$ in $\mathbf{R}^3$ the vectors $1, \mathbf{x}, \mathbf{x}^2, \mathbf{x}^3$ would
be linearly dependent (see Reference 5.1, p. 558). Hence for each $\mathbf{x}$ in $\mathbf{R}^3$, a relation of
the form $a_0 + a_1\mathbf{x} + a_2\mathbf{x}^2 + a_3\mathbf{x}^3 = 0$ would hold, where $a_0, a_1, a_2, a_3$ are real
numbers. But every polynomial of degree three with real coefficients is a product of a
linear polynomial and a quadratic polynomial with real coefficients. The only roots such
polynomials can have are either real numbers or complex numbers.

5.16 THE CAUCHY-RIEMANN EQUATIONS 

If $f$ is a complex-valued function of a complex variable, we can write each function
value in the form

$$f(z) = u(z) + iv(z),$$

where $u$ and $v$ are real-valued functions of a complex variable. We can, of course,
also consider $u$ and $v$ to be real-valued functions of two real variables and then
we write

$$f(z) = u(x, y) + iv(x, y), \quad \text{if } z = x + iy.$$

In either case, we write $f = u + iv$ and we refer to $u$ and $v$ as the real and imaginary
parts of $f$. For example, in the case of the complex exponential function $f$,
defined by

$$f(z) = e^z = e^x \cos y + ie^x \sin y,$$

the real and imaginary parts are given by

$$u(x, y) = e^x \cos y, \qquad v(x, y) = e^x \sin y.$$

Similarly, when $f(z) = z^2 = (x + iy)^2$, we find

$$u(x, y) = x^2 - y^2, \qquad v(x, y) = 2xy.$$

In the next theorem we shall see that the existence of the derivative $f'$ places a
rather severe restriction on the real and imaginary parts $u$ and $v$.

Theorem 5.22. Let $f = u + iv$ be defined on an open set $S$ in $\mathbf{C}$. If $f'(c)$ exists for
some $c$ in $S$, then the partial derivatives $D_1 u(c)$, $D_2 u(c)$, $D_1 v(c)$ and $D_2 v(c)$ also
exist and we have

$$f'(c) = D_1 u(c) + i\,D_1 v(c), \tag{3}$$

and

$$f'(c) = D_2 v(c) - i\,D_2 u(c). \tag{4}$$

This implies, in particular, that

$$D_1 u(c) = D_2 v(c) \quad \text{and} \quad D_1 v(c) = -D_2 u(c).$$

Note. These last two equations are known as the Cauchy-Riemann equations.
They are usually seen in the form

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.$$

Proof. Since $f'(c)$ exists there is a function $f^*$ defined on $S$ such that

$$f(z) - f(c) = (z - c)f^*(z), \tag{5}$$

where $f^*$ is continuous at $c$ and $f^*(c) = f'(c)$. Write

$$z = x + iy, \quad c = a + ib, \quad \text{and} \quad f^*(z) = A(z) + iB(z),$$

where $A(z)$ and $B(z)$ are real. Note that $A(z) \to A(c)$ and $B(z) \to B(c)$ as $z \to c$.
By considering only those $z$ in $S$ with $y = b$ and taking real and imaginary parts
of (5), we find

$$u(x, b) - u(a, b) = (x - a)A(x + ib), \qquad v(x, b) - v(a, b) = (x - a)B(x + ib).$$

Dividing by $x - a$ and letting $x \to a$ we find

$$D_1 u(c) = A(c) \quad \text{and} \quad D_1 v(c) = B(c).$$

Since $f'(c) = A(c) + iB(c)$, this proves (3).

Similarly, by considering only those $z$ in $S$ with $x = a$ we find

$$D_2 v(c) = A(c) \quad \text{and} \quad D_2 u(c) = -B(c),$$

which proves (4).

Applications of the Cauchy-Riemann equations are given in the next theorem. 

Theorem 5.23. Let $f = u + iv$ be a function with a derivative everywhere in an
open disk $D$ centered at $(a, b)$. If any one of $u$, $v$, or $|f|$ is constant† on $D$, then
$f$ is constant on $D$. Also, $f$ is constant if $f'(z) = 0$ for all $z$ in $D$.

Proof. Suppose $u$ is constant on $D$. The Cauchy-Riemann equations show that
$D_2 v = D_1 v = 0$ on $D$. Applying the one-dimensional Mean-Value Theorem twice
we find, for some $y'$ between $b$ and $y$,

$$v(x, y) - v(x, b) = (y - b)D_2 v(x, y') = 0,$$

and, for some $x'$ between $a$ and $x$,

$$v(x, b) - v(a, b) = (x - a)D_1 v(x', b) = 0.$$

Therefore $v(x, y) = v(a, b)$ for all $(x, y)$ in $D$, so $v$ is constant on $D$. A similar
argument shows that if $v$ is constant then $u$ is constant.

Now suppose $|f|$ is constant on $D$. Then $|f|^2 = u^2 + v^2$ is constant on $D$.
Taking partial derivatives we find

$$uD_1 u + vD_1 v = 0, \qquad uD_2 u + vD_2 v = 0.$$

By the Cauchy-Riemann equations the second equation can be written as

$$vD_1 u - uD_1 v = 0.$$

Combining this with the first to eliminate $D_1 v$ we find $(u^2 + v^2)D_1 u = 0$. If
$u^2 + v^2 = 0$, then $u = v = 0$, so $f = 0$. If $u^2 + v^2 \ne 0$ then $D_1 u = 0$; hence
$u$ is constant, so $f$ is constant.

Finally, if $f' = 0$ on $D$, both partial derivatives $D_1 v$ and $D_2 v$ are zero on $D$.
Again, as in the first part of the proof, we find $f$ is constant on $D$.

† Here $|f|$ denotes the function whose value at $z$ is $|f(z)|$.

Theorem 5.22 tells us that a necessary condition for the function $f = u + iv$ to
have a derivative at $c$ is that the four partials $D_1 u$, $D_2 u$, $D_1 v$, $D_2 v$ exist at $c$ and
satisfy the Cauchy-Riemann equations. This condition, however, is not sufficient,
as we see by considering the following example.

Example. Let $u$ and $v$ be defined as follows:

$$u(x, y) = \frac{x^3 - y^3}{x^2 + y^2} \quad \text{if } (x, y) \ne (0, 0), \qquad u(0, 0) = 0,$$

$$v(x, y) = \frac{x^3 + y^3}{x^2 + y^2} \quad \text{if } (x, y) \ne (0, 0), \qquad v(0, 0) = 0.$$

It is easily seen that $D_1 u(0, 0) = D_1 v(0, 0) = 1$ and that $D_2 u(0, 0) = -D_2 v(0, 0) = -1$,
so that the Cauchy-Riemann equations hold at $(0, 0)$. Nevertheless, the function
$f = u + iv$ cannot have a derivative at $z = 0$. In fact, for $x = 0$, the difference quotient
becomes

$$\frac{f(z) - f(0)}{z - 0} = \frac{-y + iy}{iy} = 1 + i,$$

whereas for $x = y$, it becomes

$$\frac{f(z) - f(0)}{z - 0} = \frac{xi}{x + ix} = \frac{1 + i}{2},$$

and hence $f'(0)$ cannot exist.
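The two limiting values in this example can be checked with complex arithmetic. The sketch below assumes the reading $u = (x^3 - y^3)/(x^2 + y^2)$ and $v = (x^3 + y^3)/(x^2 + y^2)$ away from the origin; the helper `quotient` and the step size `t` are chosen here for illustration.

```python
def f(z):
    x, y = z.real, z.imag
    if x == 0 and y == 0:
        return 0j
    r2 = x * x + y * y
    u = (x ** 3 - y ** 3) / r2
    v = (x ** 3 + y ** 3) / r2
    return complex(u, v)

def quotient(z):
    """Difference quotient [f(z) - f(0)] / (z - 0)."""
    return (f(z) - f(0j)) / z

t = 1e-3
along_real_axis = quotient(complex(t, 0.0))  # approach along y = 0
along_imag_axis = quotient(complex(0.0, t))  # approach along x = 0
along_diagonal = quotient(complex(t, t))     # approach along x = y
```

Along both coordinate axes the quotient equals $1 + i$, which reflects the Cauchy-Riemann equations holding at the origin, while along the diagonal it equals $(1 + i)/2$. Since the quotient depends on the direction of approach, it has no limit as $z \to 0$.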

In Chapter 12 we shall prove that the Cauchy-Riemann equations do suffice to establish existence of the derivative of $f = u + iv$ at $c$ if the partial derivatives of $u$ and $v$ are continuous in some neighborhood of $c$. To illustrate how this result is used in practice, we shall obtain the derivative of the exponential function. Let $f(z) = e^z = u + iv$. Then

$$u(x, y) = e^x \cos y, \qquad v(x, y) = e^x \sin y,$$

and hence

$$D_1 u(x, y) = e^x \cos y = D_2 v(x, y), \qquad D_2 u(x, y) = -e^x \sin y = -D_1 v(x, y).$$

Since these partial derivatives are continuous everywhere in $\mathbf{R}^2$ and satisfy the Cauchy-Riemann equations, the derivative $f'(z)$ exists for all $z$. To compute it we use Theorem 5.22 to obtain

$$f'(z) = e^x \cos y + i e^x \sin y = f(z).$$

Thus, the exponential function is its own derivative (as in the real case).
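As a numerical illustration of this conclusion, the sketch below compares difference quotients of the complex exponential, taken along several directions of approach, with $e^z$ itself. It relies on Python's standard `cmath` module; the sample point `z`, the step size `eps`, and the helper name `difference_quotient` are assumptions made for this example.

```python
import cmath

def difference_quotient(z, h):
    # (e^(z + h) - e^z) / h for a small nonzero complex increment h
    return (cmath.exp(z + h) - cmath.exp(z)) / h

z = complex(0.3, 0.7)   # arbitrary sample point
eps = 1e-6              # size of the increment

for k in range(8):
    # Approach z along eight equally spaced directions in the complex plane
    h = eps * cmath.exp(1j * cmath.pi * k / 4)
    error = abs(difference_quotient(z, h) - cmath.exp(z))
    print(k, error)
```

Every direction gives an error of roughly the same small size, in contrast with the preceding example, where the quotient depends on the direction of approach.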
EXERCISES


Real-valued functions 

In the following exercises assume, where necessary, a knowledge of the formulas for 
differentiating the elementary trigonometric, exponential, and logarithmic functions. 

5.1 A function $f$ is said to satisfy a Lipschitz condition of order $\alpha$ at $c$ if there exists a positive number $M$ (which may depend on $c$) and a 1-ball $B(c)$ such that

$$|f(x) - f(c)| < M|x - c|^{\alpha}$$

whenever $x \in B(c)$, $x \neq c$.

a) Show that a function which satisfies a Lipschitz condition of order $\alpha$ is continuous at $c$ if $\alpha > 0$, and has a derivative at $c$ if $\alpha > 1$.

b) Give an example of a function satisfying a Lipschitz condition of order 1 at $c$ for which $f'(c)$ does not exist.

5.2 In each of the following cases, determine the intervals in which the function $f$ is increasing or decreasing and find the maxima and minima (if any) in the set where each $f$ is defined.

a) $f(x) = x^3 + ax + b$, $x \in \mathbf{R}$.

b) $f(x) = \log (x^2 - 9)$, $|x| > 3$.

c) $f(x) = x^{2/3}(x - 1)^4$, $0 \le x \le 1$.

d) $f(x) = (\sin x)/x$ if $x \neq 0$, $f(0) = 1$, $0 \le x \le \pi/2$.

5.3 Find a polynomial $f$ of lowest possible degree such that

$$f(x_1) = a_1, \quad f(x_2) = a_2, \quad f'(x_1) = b_1, \quad f'(x_2) = b_2,$$

where $x_1 \neq x_2$ and $a_1, a_2, b_1, b_2$ are given real numbers.

5.4 Define $f$ as follows: $f(x) = e^{-1/x^2}$ if $x \neq 0$, $f(0) = 0$. Show that

a) $f$ is continuous for all $x$.

b) $f^{(n)}$ is continuous for all $x$, and that $f^{(n)}(0) = 0$, $(n = 1, 2, \dots)$.

5.5 Define $f$, $g$, and $h$ as follows: $f(0) = g(0) = h(0) = 0$ and, if $x \neq 0$, $f(x) = \sin (1/x)$, $g(x) = x \sin (1/x)$, $h(x) = x^2 \sin (1/x)$. Show that

a) $f'(x) = -(1/x^2) \cos (1/x)$, if $x \neq 0$; $f'(0)$ does not exist.

b) $g'(x) = \sin (1/x) - (1/x) \cos (1/x)$, if $x \neq 0$; $g'(0)$ does not exist.

c) $h'(x) = 2x \sin (1/x) - \cos (1/x)$, if $x \neq 0$; $h'(0) = 0$; $\lim_{x \to 0} h'(x)$ does not exist.

5.6 Derive Leibnitz's formula for the $n$th derivative of the product $h$ of two functions $f$ and $g$:

$$h^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x), \qquad \text{where } \binom{n}{k} = \frac{n!}{k!\,(n - k)!}.$$

5.7 Let $f$ and $g$ be two functions defined and having finite third-order derivatives $f'''(x)$ and $g'''(x)$ for all $x$ in $\mathbf{R}$. If $f(x)g(x) = 1$ for all $x$, show that the relations in (a), (b), (c), and (d) hold at those points where the denominators are not zero:

a) $f'(x)/f(x) + g'(x)/g(x) = 0$.

b) $f''(x)/f'(x) - 2f'(x)/f(x) - g''(x)/g'(x) = 0$.

c) $\dfrac{f'''(x)}{f'(x)} - 3\dfrac{f'(x)g''(x)}{f(x)g'(x)} - 3\dfrac{f''(x)}{f(x)} - \dfrac{g'''(x)}{g'(x)} = 0$.

d) $\dfrac{f'''(x)}{f'(x)} - \dfrac{3}{2}\left(\dfrac{f''(x)}{f'(x)}\right)^2 = \dfrac{g'''(x)}{g'(x)} - \dfrac{3}{2}\left(\dfrac{g''(x)}{g'(x)}\right)^2$.

NOTE. The expression which appears on the left side of (d) is called the Schwarzian derivative of $f$ at $x$.

e) Show that $f$ and $g$ have the same Schwarzian derivative if

$$g(x) = [af(x) + b]/[cf(x) + d], \qquad \text{where } ad - bc \neq 0.$$

Hint. If $c \neq 0$, write $(af + b)/(cf + d) = (a/c) + (bc - ad)/[c(cf + d)]$, and apply part (d).

5.8 Let $f_1, f_2, g_1, g_2$ be four functions having derivatives in $(a, b)$. Define $F$ by means of the determinant

$$F(x) = \begin{vmatrix} f_1(x) & f_2(x) \\ g_1(x) & g_2(x) \end{vmatrix} \qquad \text{if } x \in (a, b).$$

a) Show that $F'(x)$ exists for each $x$ in $(a, b)$ and that

$$F'(x) = \begin{vmatrix} f_1'(x) & f_2'(x) \\ g_1(x) & g_2(x) \end{vmatrix} + \begin{vmatrix} f_1(x) & f_2(x) \\ g_1'(x) & g_2'(x) \end{vmatrix}.$$

b) State and prove a more general result for $n$th order determinants.

5.9 Given $n$ functions $f_1, \dots, f_n$, each having $n$th order derivatives in $(a, b)$. A function $W$, called the Wronskian of $f_1, \dots, f_n$, is defined as follows: For each $x$ in $(a, b)$, $W(x)$ is the value of the determinant of order $n$ whose element in the $k$th row and $m$th column is $f_m^{(k-1)}(x)$, where $k = 1, 2, \dots, n$ and $m = 1, 2, \dots, n$. [The expression $f_m^{(0)}(x)$ is written for $f_m(x)$.]

a) Show that $W'(x)$ can be obtained by replacing the last row of the determinant defining $W(x)$ by the $n$th derivatives $f_1^{(n)}(x), \dots, f_n^{(n)}(x)$.

b) Assuming the existence of $n$ constants $c_1, \dots, c_n$, not all zero, such that $c_1 f_1(x) + \cdots + c_n f_n(x) = 0$ for every $x$ in $(a, b)$, show that $W(x) = 0$ for each $x$ in $(a, b)$.

NOTE. A set of functions satisfying such a relation is said to be a linearly dependent set on $(a, b)$.

c) The vanishing of the Wronskian throughout $(a, b)$ is necessary, but not sufficient, for linear dependence of $f_1, \dots, f_n$. Show that in the case of two functions, if the Wronskian vanishes throughout $(a, b)$ and if one of the functions does not vanish in $(a, b)$, then they form a linearly dependent set in $(a, b)$.


Mean-Value Theorem


5.10 Given a function $f$ defined and having a finite derivative in $(a, b)$ and such that $\lim_{x \to b-} f(x) = +\infty$. Prove that $\lim_{x \to b-} f'(x)$ either fails to exist or is infinite.

5.11 Show that the formula in the Mean-Value Theorem can be written as follows:

$$\frac{f(x + h) - f(x)}{h} = f'(x + \theta h),$$

where $0 < \theta < 1$. Determine $\theta$ as a function of $x$ and $h$ when

a) $f(x) = x^2$, b) $f(x) = x^3$,

c) $f(x) = e^x$, d) $f(x) = \log x$, $x > 0$.

Keep $x \neq 0$ fixed, and find $\lim_{h \to 0} \theta$ in each case.

5.12 Take $f(x) = 3x^4 - 2x^3 - x^2 + 1$ and $g(x) = 4x^3 - 3x^2 - 2x$ in Theorem 5.20. Show that $f'(x)/g'(x)$ is never equal to the quotient $[f(1) - f(0)]/[g(1) - g(0)]$ if $0 < x < 1$. How do you reconcile this with the equation

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x_1)}{g'(x_1)}, \qquad a < x_1 < b,$$

obtainable from Theorem 5.20 when $n = 1$?

5.13 In each of the following special cases of Theorem 5.20, take $n = 1$, $c = a$, $x = b$, and show that $x_1 = (a + b)/2$.

a) $f(x) = \sin x$, $g(x) = \cos x$; b) $f(x) = e^x$, $g(x) = e^{-x}$.

Can you find a general class of such pairs of functions $f$ and $g$ for which $x_1$ will always be $(a + b)/2$ and such that both examples (a) and (b) are in this class?

5.14 Given a function $f$ defined and having a finite derivative $f'$ in the half-open interval $0 < x \le 1$ and such that $|f'(x)| < 1$. Define $a_n = f(1/n)$ for $n = 1, 2, 3, \dots$, and show that $\lim_{n \to \infty} a_n$ exists. Hint. Cauchy condition.

5.15 Assume that $f$ has a finite derivative at each point of the open interval $(a, b)$. Assume also that $\lim_{x \to c} f'(x)$ exists and is finite for some interior point $c$. Prove that the value of this limit must be $f'(c)$.

5.16 Let $f$ be continuous on $(a, b)$ with a finite derivative $f'$ everywhere in $(a, b)$, except possibly at $c$. If $\lim_{x \to c} f'(x)$ exists and has the value $A$, show that $f'(c)$ must also exist and have the value $A$.

5.17 Let $f$ be continuous on $[0, 1]$, $f(0) = 0$, $f'(x)$ finite for each $x$ in $(0, 1)$. Prove that if $f'$ is an increasing function on $(0, 1)$, then so too is the function $g$ defined by the equation $g(x) = f(x)/x$.

5.18 Assume $f$ has a finite derivative in $(a, b)$ and is continuous on $[a, b]$ with $f(a) = f(b) = 0$. Prove that for every real $\lambda$ there is some $c$ in $(a, b)$ such that $f'(c) = \lambda f(c)$.

Hint. Apply Rolle's theorem to $g(x)f(x)$ for a suitable $g$ depending on $\lambda$.

5.19 Assume $f$ is continuous on $[a, b]$ and has a finite second derivative $f''$ in the open interval $(a, b)$. Assume that the line segment joining the points $A = (a, f(a))$ and $B = (b, f(b))$ intersects the graph of $f$ in a third point $P$ different from $A$ and $B$. Prove that $f''(c) = 0$ for some $c$ in $(a, b)$.
5.20 If $f$ has a finite third derivative $f'''$ in $[a, b]$ and if

$$f(a) = f'(a) = f(b) = f'(b) = 0,$$

prove that $f'''(c) = 0$ for some $c$ in $(a, b)$.

5.21 Assume $f$ is nonnegative and has a finite third derivative $f'''$ in the open interval $(0, 1)$. If $f(x) = 0$ for at least two values of $x$ in $(0, 1)$, prove that $f'''(c) = 0$ for some $c$ in $(0, 1)$.

5.22 Assume $f$ has a finite derivative in some interval $(a, +\infty)$.

a) If $f(x) \to 1$ and $f'(x) \to c$ as $x \to +\infty$, prove that $c = 0$.

b) If $f'(x) \to 1$ as $x \to +\infty$, prove that $f(x)/x \to 1$ as $x \to +\infty$.

c) If $f'(x) \to 0$ as $x \to +\infty$, prove that $f(x)/x \to 0$ as $x \to +\infty$.

5.23 Let $h$ be a fixed positive number. Show that there is no function $f$ satisfying the following three conditions: $f'(x)$ exists for $x \ge 0$, $f'(0) = 0$, $f'(x) \ge h$ for $x > 0$.

5.24 If $h > 0$ and if $f'(x)$ exists (and is finite) for every $x$ in $(a - h, a + h)$, and if $f$ is continuous on $[a - h, a + h]$, show that we have:

a) $\dfrac{f(a + h) - f(a - h)}{h} = f'(a + \theta h) + f'(a - \theta h)$, $\quad 0 < \theta < 1$;

b) $\dfrac{f(a + h) - 2f(a) + f(a - h)}{h} = f'(a + \lambda h) - f'(a - \lambda h)$, $\quad 0 < \lambda < 1$.

c) If $f''(a)$ exists, show that

$$f''(a) = \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2}.$$

d) Give an example where the limit of the quotient in (c) exists but where $f''(a)$ does not exist.

5.25 Let $f$ have a finite derivative in $(a, b)$ and assume that $c \in (a, b)$. Consider the following condition: For every $\varepsilon > 0$ there exists a 1-ball $B(c; \delta)$, whose radius $\delta$ depends only on $\varepsilon$ and not on $c$, such that if $x \in B(c; \delta)$, and $x \neq c$, then

$$\left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < \varepsilon.$$

Show that $f'$ is continuous on $(a, b)$ if this condition holds throughout $(a, b)$.

5.26 Assume $f$ has a finite derivative in $(a, b)$ and is continuous on $[a, b]$, with $a \le f(x) \le b$ for all $x$ in $[a, b]$ and $|f'(x)| \le \alpha < 1$ for all $x$ in $(a, b)$. Prove that $f$ has a unique fixed point in $[a, b]$.

5.27 Give an example of a pair of functions $f$ and $g$ having finite derivatives in $(0, 1)$, such that

$$\lim_{x \to 0} \frac{f(x)}{g(x)} = 0,$$

but such that $\lim_{x \to 0} f'(x)/g'(x)$ does not exist, choosing $g$ so that $g'(x)$ is never zero.
5.28 Prove the following theorem:

Let $f$ and $g$ be two functions having finite $n$th derivatives in $(a, b)$. For some interior point $c$ in $(a, b)$, assume that $f(c) = f'(c) = \cdots = f^{(n-1)}(c) = 0$, and that $g(c) = g'(c) = \cdots = g^{(n-1)}(c) = 0$, but that $g^{(n)}(x)$ is never zero in $(a, b)$. Show that

$$\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{f^{(n)}(c)}{g^{(n)}(c)}.$$

NOTE. $f^{(n)}$ and $g^{(n)}$ are not assumed to be continuous at $c$. Hint. Let

$$F(x) = f(x) - \frac{(x - c)^{n-1}}{(n - 1)!} f^{(n)}(c),$$

define $G$ similarly, and apply Theorem 5.20 to the functions $F$ and $G$.

5.29 Show that the formula in Taylor's theorem can also be written as follows:

$$f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{(x - c)(x - x_1)^{n-1}}{(n - 1)!} f^{(n)}(x_1),$$

where $x_1$ is interior to the interval joining $x$ and $c$. Let $1 - \theta = (x - x_1)/(x - c)$. Show that $0 < \theta < 1$ and deduce the following form of the remainder term (due to Cauchy):

$$\frac{(1 - \theta)^{n-1}(x - c)^n}{(n - 1)!} f^{(n)}[\theta x + (1 - \theta)c].$$

Hint. Take $G(t) = g(t) = t$ in the proof of Theorem 5.20.


Vector-valued functions


5.30 If a vector-valued function $\mathbf{f}$ is differentiable at $c$, prove that

$$\mathbf{f}'(c) = \lim_{h \to 0} \frac{1}{h} [\mathbf{f}(c + h) - \mathbf{f}(c)].$$

Conversely, if this limit exists, prove that $\mathbf{f}$ is differentiable at $c$.

5.31 A vector-valued function $\mathbf{f}$ is differentiable at each point of $(a, b)$ and has constant norm $\|\mathbf{f}\|$. Prove that $\mathbf{f}(t) \cdot \mathbf{f}'(t) = 0$ on $(a, b)$.

5.32 A vector-valued function $\mathbf{f}$ is never zero and has a derivative $\mathbf{f}'$ which exists and is continuous on $\mathbf{R}$. If there is a real function $\lambda$ such that $\mathbf{f}'(t) = \lambda(t)\mathbf{f}(t)$ for all $t$, prove that there is a positive real function $u$ and a constant vector $\mathbf{c}$ such that $\mathbf{f}(t) = u(t)\mathbf{c}$ for all $t$.


Partial derivatives 

5.33 Consider the function $f$ defined on $\mathbf{R}^2$ by the following formulas:

$$f(x, y) = \frac{xy}{x^2 + y^2} \quad \text{if } (x, y) \neq (0, 0), \qquad f(0, 0) = 0.$$

Prove that the partial derivatives $D_1 f(x, y)$ and $D_2 f(x, y)$ exist for every $(x, y)$ in $\mathbf{R}^2$ and evaluate these derivatives explicitly in terms of $x$ and $y$. Also, show that $f$ is not continuous at $(0, 0)$.

5.34 Let $f$ be defined on $\mathbf{R}^2$ as follows:

$$f(x, y) = y\,\frac{x^2 - y^2}{x^2 + y^2} \quad \text{if } (x, y) \neq (0, 0), \qquad f(0, 0) = 0.$$

Compute the first- and second-order partial derivatives of $f$ at the origin, when they exist.


Complex-valued functions

5.35 Let $S$ be an open set in $\mathbf{C}$ and let $S^*$ be the set of complex conjugates $\bar{z}$, where $z \in S$. If $f$ is defined on $S$, define $g$ on $S^*$ as follows: $g(\bar{z}) = \overline{f(z)}$, the complex conjugate of $f(z)$. If $f$ is differentiable at $c$ prove that $g$ is differentiable at $\bar{c}$ and that $g'(\bar{c}) = \overline{f'(c)}$.

5.36 i) In each of the following examples write $f = u + iv$ and find explicit formulas for $u(x, y)$ and $v(x, y)$:

a) $f(z) = \sin z$, b) $f(z) = \cos z$,

c) $f(z) = |z|$, d) $f(z) = \bar{z}$,

e) $f(z) = \arg z$ $(z \neq 0)$, f) $f(z) = \operatorname{Log} z$ $(z \neq 0)$,

g) $f(z) = e^{z^2}$, h) $f(z) = z^{\alpha}$ ($\alpha$ complex, $z \neq 0$).

(These functions are to be defined as indicated in Chapter 1.)

ii) Show that $u$ and $v$ satisfy the Cauchy-Riemann equations for the following values of $z$: All $z$ in (a), (b), (g); no $z$ in (c), (d), (e); all $z$ except real $z \le 0$ in (f), (h). (In part (h), the Cauchy-Riemann equations hold for all $z$ if $\alpha$ is a nonnegative integer, and they hold for all $z \neq 0$ if $\alpha$ is a negative integer.)

iii) Compute the derivative $f'(z)$ in (a), (b), (f), (g), (h), assuming it exists.

5.37 Write $f = u + iv$ and assume that $f$ has a derivative at each point of an open disk $D$ centered at $(0, 0)$. If $au^2 + bv^2$ is constant on $D$ for some real $a$ and $b$, not both 0, prove that $f$ is constant on $D$.


CHAPTER 6

FUNCTIONS OF BOUNDED VARIATION AND RECTIFIABLE CURVES


6.1 INTRODUCTION 

Some of the basic properties of monotonic functions were derived in Chapter 4. 
This brief chapter discusses functions of bounded variation, a class of functions 
closely related to monotonic functions. We shall find that these functions are 
intimately connected with curves having finite arc length (rectifiable curves). They 
also play a role in the theory of Riemann-Stieltjes integration which is developed 
in the next chapter. 

6.2 PROPERTIES OF MONOTONIC FUNCTIONS 

Theorem 6.1. Let $f$ be an increasing function defined on $[a, b]$ and let $x_0, x_1, \dots, x_n$ be $n + 1$ points such that

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b.$$

Then we have the inequality

$$\sum_{k=1}^{n-1} [f(x_k+) - f(x_k-)] \le f(b) - f(a).$$

Proof. Assume that $y_k \in (x_k, x_{k+1})$. For $1 \le k \le n - 1$, we have $f(x_k+) \le f(y_k)$ and $f(y_{k-1}) \le f(x_k-)$, so that $f(x_k+) - f(x_k-) \le f(y_k) - f(y_{k-1})$. If we add these inequalities, the sum on the right telescopes to $f(y_{n-1}) - f(y_0)$. Since $f(y_{n-1}) - f(y_0) \le f(b) - f(a)$, this completes the proof.

The difference $f(x_k+) - f(x_k-)$ is, of course, the jump of $f$ at $x_k$. The foregoing theorem tells us that for every finite collection of points $x_k$ in $(a, b)$, the sum of the jumps at these points is always bounded by $f(b) - f(a)$. This result can be used to prove the following theorem.

Theorem 6.2. If $f$ is monotonic on $[a, b]$, then the set of discontinuities of $f$ is countable.

Proof. Assume that $f$ is increasing and let $S_m$ be the set of points in $(a, b)$ at which the jump of $f$ exceeds $1/m$, $m > 0$. If $x_1 < x_2 < \cdots < x_{n-1}$ are in $S_m$, Theorem 6.1 tells us that

$$\frac{n - 1}{m} \le f(b) - f(a).$$

This means that $S_m$ must be a finite set. But the set of discontinuities of $f$ in $(a, b)$ is a subset of the union $\bigcup_{m=1}^{\infty} S_m$ and hence is countable. (If $f$ is decreasing, the argument can be applied to $-f$.)


6.3 FUNCTIONS OF BOUNDED VARIATION 


Definition 6.3. If $[a, b]$ is a compact interval, a set of points

$$P = \{x_0, x_1, \dots, x_n\},$$

satisfying the inequalities

$$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b,$$

is called a partition of $[a, b]$. The interval $[x_{k-1}, x_k]$ is called the $k$th subinterval of $P$ and we write $\Delta x_k = x_k - x_{k-1}$, so that $\sum_{k=1}^{n} \Delta x_k = b - a$. The collection of all possible partitions of $[a, b]$ will be denoted by $\mathcal{P}[a, b]$.

Definition 6.4. Let $f$ be defined on $[a, b]$. If $P = \{x_0, x_1, \dots, x_n\}$ is a partition of $[a, b]$, write $\Delta f_k = f(x_k) - f(x_{k-1})$, for $k = 1, 2, \dots, n$. If there exists a positive number $M$ such that

$$\sum_{k=1}^{n} |\Delta f_k| \le M$$

for all partitions of $[a, b]$, then $f$ is said to be of bounded variation on $[a, b]$.

Examples of functions of bounded variation are provided by the next two 
theorems. 


Theorem 6.5. If $f$ is monotonic on $[a, b]$, then $f$ is of bounded variation on $[a, b]$.

Proof. Let $f$ be increasing. Then for every partition of $[a, b]$ we have $\Delta f_k \ge 0$ and hence

$$\sum_{k=1}^{n} |\Delta f_k| = \sum_{k=1}^{n} \Delta f_k = \sum_{k=1}^{n} [f(x_k) - f(x_{k-1})] = f(b) - f(a).$$

Theorem 6.6. If $f$ is continuous on $[a, b]$ and if $f'$ exists and is bounded in the interior, say $|f'(x)| \le A$ for all $x$ in $(a, b)$, then $f$ is of bounded variation on $[a, b]$.

Proof. Applying the Mean-Value Theorem, we have

$$\Delta f_k = f(x_k) - f(x_{k-1}) = f'(t_k)(x_k - x_{k-1}), \qquad \text{where } t_k \in (x_{k-1}, x_k).$$

This implies

$$\sum_{k=1}^{n} |\Delta f_k| = \sum_{k=1}^{n} |f'(t_k)| \,\Delta x_k \le A \sum_{k=1}^{n} \Delta x_k = A(b - a).$$
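The bounds in Theorems 6.5 and 6.6 can be observed on randomly chosen partitions. The Python sketch below is only an illustration: the helper names `variation_sum` and `random_partition`, the sample functions, and the partition sizes are assumptions made for this example.

```python
import math
import random

def variation_sum(func, points):
    """Sum of |func(x_k) - func(x_{k-1})| over consecutive partition points."""
    return sum(abs(func(q) - func(p)) for p, q in zip(points, points[1:]))

def random_partition(a, b, n, rng):
    """Endpoints a and b together with n - 1 sorted random points of [a, b].

    A repeated point would only contribute a zero term to the sum.
    """
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]

rng = random.Random(0)

# Theorem 6.5: exp is increasing on [0, 1], so the sum telescopes to e - 1.
p = random_partition(0.0, 1.0, 50, rng)
print(variation_sum(math.exp, p), math.e - 1)

# Theorem 6.6: |cos x| <= 1, so for f = sin on [0, 2*pi] the bound is A(b - a) = 2*pi.
q = random_partition(0.0, 2 * math.pi, 50, rng)
print(variation_sum(math.sin, q), 2 * math.pi)
```

The first pair of printed numbers agrees up to rounding error, and the second sum stays below the bound $A(b - a)$ with $A = 1$.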

Theorem 6.7. If $f$ is of bounded variation on $[a, b]$, say $\sum |\Delta f_k| \le M$ for all partitions of $[a, b]$, then $f$ is bounded on $[a, b]$. In fact,

$$|f(x)| \le |f(a)| + M \qquad \text{for all } x \text{ in } [a, b].$$

Proof. Assume that $x \in (a, b)$. Using the special partition $P = \{a, x, b\}$, we find

$$|f(x) - f(a)| + |f(b) - f(x)| \le M.$$

This implies $|f(x) - f(a)| \le M$, $|f(x)| \le |f(a)| + M$. The same inequality holds if $x = a$ or $x = b$.


Examples 

1. It is easy to construct a continuous function which is not of bounded variation. For example, let $f(x) = x \cos \{\pi/(2x)\}$ if $x \neq 0$, $f(0) = 0$. Then $f$ is continuous on $[0, 1]$, but if we consider the partition into $2n$ subintervals

$$P = \left\{ 0, \frac{1}{2n}, \frac{1}{2n - 1}, \dots, \frac{1}{3}, \frac{1}{2}, 1 \right\},$$

an easy calculation shows that we have

$$\sum_{k=1}^{2n} |\Delta f_k| = \frac{1}{2n} + \frac{1}{2n} + \frac{1}{2n - 2} + \frac{1}{2n - 2} + \cdots + \frac{1}{2} + \frac{1}{2} = 1 + \frac{1}{2} + \cdots + \frac{1}{n}.$$

This is not bounded for all $n$, since the series $\sum_{n=1}^{\infty} (1/n)$ diverges. In this example the derivative $f'$ exists in $(0, 1)$ but $f'$ is not bounded on $(0, 1)$. However, $f'$ is bounded on any compact interval not containing the origin and hence $f$ will be of bounded variation on such an interval.

2. An example similar to the first is given by $f(x) = x^2 \cos (1/x)$ if $x \neq 0$, $f(0) = 0$. This $f$ is of bounded variation on $[0, 1]$, since $f'$ is bounded on $[0, 1]$. In fact, $f'(0) = 0$ and, for $x \neq 0$, $f'(x) = \sin (1/x) + 2x \cos (1/x)$, so that $|f'(x)| \le 3$ for all $x$ in $[0, 1]$.

3. Boundedness of $f'$ is not necessary for $f$ to be of bounded variation. For example, let $f(x) = x^{1/3}$. This function is monotonic (and hence of bounded variation) on every finite interval. However, $f'(x) \to +\infty$ as $x \to 0$.
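The "easy calculation" in Example 1 can be reproduced numerically. The sketch below evaluates the sum of $|\Delta f_k|$ over the partition $\{0, 1/(2n), 1/(2n-1), \dots, 1/2, 1\}$ and compares it with the partial sum $1 + 1/2 + \cdots + 1/n$ of the harmonic series. The helper names are chosen for this illustration and are not taken from the text.

```python
import math

def f(x):
    # f(x) = x cos(pi / (2x)) for x != 0, f(0) = 0
    return 0.0 if x == 0 else x * math.cos(math.pi / (2 * x))

def variation_sum(func, points):
    """Sum of |func(x_k) - func(x_{k-1})| over consecutive partition points."""
    return sum(abs(func(q) - func(p)) for p, q in zip(points, points[1:]))

def example_partition(n):
    # The points 0 < 1/(2n) < 1/(2n - 1) < ... < 1/2 < 1, giving 2n subintervals
    return [0.0] + [1.0 / k for k in range(2 * n, 0, -1)]

def harmonic(n):
    # Partial sum 1 + 1/2 + ... + 1/n
    return sum(1.0 / k for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    print(n, variation_sum(f, example_partition(n)), harmonic(n))
```

The two columns agree up to rounding error and grow without bound as $n$ increases, which is the reason $f$ fails to be of bounded variation on $[0, 1]$.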


6.4 TOTAL VARIATION 

Definition 6.8. Let $f$ be of bounded variation on $[a, b]$, and let $\sum (P)$ denote the sum $\sum_{k=1}^{n} |\Delta f_k|$ corresponding to the partition $P = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$. The number

$$V_f(a, b) = \sup \left\{ \sum (P) : P \in \mathcal{P}[a, b] \right\}$$

is called the total variation of $f$ on the interval $[a, b]$.

NOTE. When there is no danger of misunderstanding, we will write $V_f$ instead of $V_f(a, b)$.

Since $f$ is of bounded variation on $[a, b]$, the number $V_f$ is finite. Also, $V_f \ge 0$, since each sum $\sum (P) \ge 0$. Moreover, $V_f(a, b) = 0$ if, and only if, $f$ is constant on $[a, b]$.

Theorem 6.9. Assume that $f$ and $g$ are each of bounded variation on $[a, b]$. Then so are their sum, difference, and product. Also, we have

$$V_{f \pm g} \le V_f + V_g \qquad \text{and} \qquad V_{f \cdot g} \le A V_f + B V_g,$$

where

$$A = \sup \{ |g(x)| : x \in [a, b] \}, \qquad B = \sup \{ |f(x)| : x \in [a, b] \}.$$

Proof. Let $h(x) = f(x)g(x)$. For every partition $P$ of $[a, b]$, we have

$$|\Delta h_k| = |f(x_k)g(x_k) - f(x_{k-1})g(x_{k-1})|$$

$$= |[f(x_k)g(x_k) - f(x_{k-1})g(x_k)] + [f(x_{k-1})g(x_k) - f(x_{k-1})g(x_{k-1})]| \le A|\Delta f_k| + B|\Delta g_k|.$$

This implies that $h$ is of bounded variation and that $V_h \le A V_f + B V_g$. The proofs for the sum and difference are simpler and will be omitted.

NOTE. Quotients were not included in the foregoing theorem because the reciprocal of a function of bounded variation need not be of bounded variation. For example, if $f(x) \to 0$ as $x \to x_0$, then $1/f$ will not be bounded on any interval containing $x_0$ and (by Theorem 6.7) $1/f$ cannot be of bounded variation on such an interval. To extend Theorem 6.9 to quotients, it suffices to exclude functions whose values become arbitrarily close to zero.

Theorem 6.10. Let $f$ be of bounded variation on $[a, b]$ and assume that $f$ is bounded away from zero; that is, suppose that there exists a positive number $m$ such that $0 < m \le |f(x)|$ for all $x$ in $[a, b]$. Then $g = 1/f$ is also of bounded variation on $[a, b]$, and $V_g \le V_f/m^2$.

Proof.

$$|\Delta g_k| = \left| \frac{1}{f(x_k)} - \frac{1}{f(x_{k-1})} \right| = \left| \frac{\Delta f_k}{f(x_k)f(x_{k-1})} \right| \le \frac{|\Delta f_k|}{m^2}.$$

6.5 ADDITIVE PROPERTY OF TOTAL VARIATION

In the last two theorems the interval $[a, b]$ was kept fixed and $V_f(a, b)$ was considered as a function of $f$. If we keep $f$ fixed and study the total variation as a function of the interval $[a, b]$, we can prove the following additive property.

Theorem 6.11. Let $f$ be of bounded variation on $[a, b]$, and assume that $c \in (a, b)$. Then $f$ is of bounded variation on $[a, c]$ and on $[c, b]$ and we have

$$V_f(a, b) = V_f(a, c) + V_f(c, b).$$

Proof. We first prove that $f$ is of bounded variation on $[a, c]$ and on $[c, b]$. Let $P_1$ be a partition of $[a, c]$ and let $P_2$ be a partition of $[c, b]$. Then $P_0 = P_1 \cup P_2$ is a partition of $[a, b]$. If $\sum (P)$ denotes the sum $\sum |\Delta f_k|$ corresponding to the partition $P$ (of the appropriate interval), we can write

$$\sum (P_1) + \sum (P_2) = \sum (P_0) \le V_f(a, b). \tag{1}$$

This shows that each sum $\sum (P_1)$ and $\sum (P_2)$ is bounded by $V_f(a, b)$ and this means that $f$ is of bounded variation on $[a, c]$ and on $[c, b]$. From (1) we also obtain the inequality

$$V_f(a, c) + V_f(c, b) \le V_f(a, b),$$

because of Theorem 1.15.

To obtain the reverse inequality, let $P = \{x_0, x_1, \dots, x_n\} \in \mathcal{P}[a, b]$ and let $P_0 = P \cup \{c\}$ be the (possibly new) partition obtained by adjoining the point $c$. If $c \in [x_{k-1}, x_k]$, then we have

$$|f(x_k) - f(x_{k-1})| \le |f(x_k) - f(c)| + |f(c) - f(x_{k-1})|,$$

and hence $\sum (P) \le \sum (P_0)$. Now the points of $P_0$ in $[a, c]$ determine a partition $P_1$ of $[a, c]$ and those in $[c, b]$ determine a partition $P_2$ of $[c, b]$. The corresponding sums for all these partitions are connected by the relation

$$\sum (P) \le \sum (P_0) = \sum (P_1) + \sum (P_2) \le V_f(a, c) + V_f(c, b).$$

Therefore, $V_f(a, c) + V_f(c, b)$ is an upper bound for every sum $\sum (P)$. Since this cannot be smaller than the least upper bound, we must have

$$V_f(a, b) \le V_f(a, c) + V_f(c, b),$$

and this completes the proof.
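The additive property can be observed on a concrete function. In the sketch below, $f = \sin$ on $[0, 2\pi]$ with $c = 2$ is an assumed example, and each partition contains the points where $\sin$ changes direction. Since $\sin$ is monotonic between consecutive points, each sum is taken to equal the total variation on its interval (this uses the computation in the proof of Theorem 6.5 together with the additive property itself, so the check is a consistency illustration rather than a proof).

```python
import math

def variation_sum(func, points):
    """Sum of |func(x_k) - func(x_{k-1})| over consecutive partition points."""
    return sum(abs(func(q) - func(p)) for p, q in zip(points, points[1:]))

a, c, b = 0.0, 2.0, 2 * math.pi
turning = [math.pi / 2, 3 * math.pi / 2]   # points where sin changes direction

def partition(lo, hi):
    # Endpoints plus those turning points lying strictly inside (lo, hi)
    return [lo] + [t for t in turning if lo < t < hi] + [hi]

v_ac = variation_sum(math.sin, partition(a, c))
v_cb = variation_sum(math.sin, partition(c, b))
v_ab = variation_sum(math.sin, partition(a, b))

print(v_ac, v_cb, v_ac + v_cb, v_ab)
```

The last two printed values coincide up to rounding error, as the theorem predicts.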

6.6 TOTAL VARIATION ON [a, x] AS A FUNCTION OF x

Now we keep the function $f$ and the left endpoint of the interval fixed and study the total variation as a function of the right endpoint. The additive property implies important consequences for this function.

Theorem 6.12. Let $f$ be of bounded variation on $[a, b]$. Let $V$ be defined on $[a, b]$ as follows: $V(x) = V_f(a, x)$ if $a < x \le b$, $V(a) = 0$. Then:

i) $V$ is an increasing function on $[a, b]$.

ii) $V - f$ is an increasing function on $[a, b]$.

Proof. If $a < x < y \le b$, we can write $V_f(a, y) = V_f(a, x) + V_f(x, y)$. This implies $V(y) - V(x) = V_f(x, y) \ge 0$. Hence $V(x) \le V(y)$, and (i) holds.

To prove (ii), let $D(x) = V(x) - f(x)$ if $x \in [a, b]$. Then, if $a \le x < y \le b$, we have

$$D(y) - D(x) = V(y) - V(x) - [f(y) - f(x)] = V_f(x, y) - [f(y) - f(x)].$$

But from the definition of $V_f(x, y)$ it follows that we have

$$f(y) - f(x) \le V_f(x, y).$$

This means that $D(y) - D(x) \ge 0$, and (ii) holds.

NOTE. For some functions $f$, the total variation $V_f(a, x)$ can be expressed as an integral. (See Exercise 7.20.)

6.7 FUNCTIONS OF BOUNDED VARIATION EXPRESSED AS THE DIFFERENCE OF INCREASING FUNCTIONS

The following simple and elegant characterization of functions of bounded variation is a consequence of Theorem 6.12.

Theorem 6.13. Let $f$ be defined on $[a, b]$. Then $f$ is of bounded variation on $[a, b]$ if, and only if, $f$ can be expressed as the difference of two increasing functions.

Proof. If $f$ is of bounded variation on $[a, b]$, we can write $f = V - D$, where $V$ is the function of Theorem 6.12 and $D = V - f$. Both $V$ and $D$ are increasing functions on $[a, b]$.

The converse follows at once from Theorems 6.5 and 6.9.

The representation of a function of bounded variation as a difference of two increasing functions is by no means unique. If $f = f_1 - f_2$, where $f_1$ and $f_2$ are increasing, we also have $f = (f_1 + g) - (f_2 + g)$, where $g$ is an arbitrary increasing function, and we get a new representation of $f$. If $g$ is strictly increasing, the same will be true of $f_1 + g$ and $f_2 + g$. Therefore, Theorem 6.13 also holds if "increasing" is replaced by "strictly increasing."
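The decomposition $f = V - D$ can be approximated on a grid. The sketch below replaces $V(x) = V_f(a, x)$ by the running sum of $|\Delta f_k|$ over the grid points up to $x$, which is a lower bound for the true total variation and only an approximation of it; the function name `jordan_on_grid`, the choice $f = \sin$ on $[0, 2\pi]$, and the grid size are assumptions made for this illustration.

```python
import math
from itertools import accumulate

def jordan_on_grid(func, grid):
    """Return (values, V, D): func on the grid, the running variation sum V,
    and D = V - func, so that func = V - D at every grid point."""
    values = [func(x) for x in grid]
    steps = [abs(q - p) for p, q in zip(values, values[1:])]
    V = [0.0] + list(accumulate(steps))          # V(a) = 0, then cumulative sums
    D = [v - y for v, y in zip(V, values)]
    return values, V, D

n = 2000
grid = [2 * math.pi * i / n for i in range(n + 1)]
values, V, D = jordan_on_grid(math.sin, grid)

tol = 1e-12  # allowance for floating-point rounding
print(all(V[i] <= V[i + 1] + tol for i in range(n)))   # V is increasing on the grid
print(all(D[i] <= D[i + 1] + tol for i in range(n)))   # D = V - f is increasing on the grid
print(V[-1])                                           # close to 4 for sin on [0, 2*pi]
```

Both monotonicity checks succeed on the grid, which mirrors parts (i) and (ii) of Theorem 6.12.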


6.8 CONTINUOUS FUNCTIONS OF BOUNDED VARIATION 

Theorem 6.14. Let $f$ be of bounded variation on $[a, b]$. If $x \in (a, b]$, let $V(x) = V_f(a, x)$ and put $V(a) = 0$. Then every point of continuity of $f$ is also a point of continuity of $V$. The converse is also true.

Proof. Since $V$ is monotonic, the right- and lefthand limits $V(x+)$ and $V(x-)$ exist for each point $x$ in $(a, b)$. Because of Theorem 6.13, the same is true of $f(x+)$ and $f(x-)$.

If $a < x < y \le b$, then we have [by definition of $V_f(x, y)$]

$$0 \le |f(y) - f(x)| \le V(y) - V(x).$$

Letting $y \to x$, we find

$$0 \le |f(x+) - f(x)| \le V(x+) - V(x).$$

Similarly, $0 \le |f(x) - f(x-)| \le V(x) - V(x-)$. These inequalities imply that a point of continuity of $V$ is also a point of continuity of $f$.

To prove the converse, let $f$ be continuous at the point $c$ in $(a, b)$. Then, given $\varepsilon > 0$, there exists a $\delta > 0$ such that $0 < |x - c| < \delta$ implies $|f(x) - f(c)| < \varepsilon/2$. For this same $\varepsilon$, there also exists a partition $P$ of $[c, b]$, say

$$P = \{x_0, x_1, \dots, x_n\}, \qquad x_0 = c, \quad x_n = b,$$

such that

$$V_f(c, b) - \frac{\varepsilon}{2} < \sum_{k=1}^{n} |\Delta f_k|.$$

Adding more points to $P$ can only increase the sum $\sum |\Delta f_k|$ and hence we can assume that $0 < x_1 - x_0 < \delta$. This means that

$$|\Delta f_1| = |f(x_1) - f(c)| < \frac{\varepsilon}{2},$$

and the foregoing inequality now becomes

$$V_f(c, b) - \frac{\varepsilon}{2} < \frac{\varepsilon}{2} + \sum_{k=2}^{n} |\Delta f_k| \le \frac{\varepsilon}{2} + V_f(x_1, b),$$

since $\{x_1, x_2, \dots, x_n\}$ is a partition of $[x_1, b]$. We therefore have

$$V_f(c, b) - V_f(x_1, b) < \varepsilon.$$

But

$$0 \le V(x_1) - V(c) = V_f(a, x_1) - V_f(a, c) = V_f(c, x_1) = V_f(c, b) - V_f(x_1, b) < \varepsilon.$$

Hence we have shown that

$$0 < x_1 - c < \delta \qquad \text{implies} \qquad 0 \le V(x_1) - V(c) < \varepsilon.$$

This proves that $V(c+) = V(c)$. A similar argument yields $V(c-) = V(c)$. The theorem is therefore proved for all interior points of $[a, b]$. (Trivial modifications are needed for the endpoints.)

Combining Theorem 6.14 with 6.13, we can state 

Theorem 6.15. Let $f$ be continuous on $[a, b]$. Then $f$ is of bounded variation on $[a, b]$ if, and only if, $f$ can be expressed as the difference of two increasing continuous functions.

NOTE. The theorem also holds if "increasing" is replaced by "strictly increasing."

Of course, discontinuities (if any) of a function of bounded variation must 
be jump discontinuities because of Theorem 6.13. Moreover, Theorem 6.2 tells us 
that they form a countable set. 


6.9 CURVES AND PATHS 

Let $\mathbf{f} : [a, b] \to \mathbf{R}^n$ be a vector-valued function, continuous on a compact interval $[a, b]$ in $\mathbf{R}$. As $t$ runs through $[a, b]$, the function values $\mathbf{f}(t)$ trace out a set of points in $\mathbf{R}^n$ called the graph of $\mathbf{f}$ or the curve described by $\mathbf{f}$. A curve is a compact and connected subset of $\mathbf{R}^n$ since it is the continuous image of a compact interval. The function $\mathbf{f}$ itself is called a path.

It is often helpful to imagine a curve as being traced out by a moving particle. The interval $[a, b]$ is thought of as a time interval and the vector $\mathbf{f}(t)$ specifies the position of the particle at time $t$. In this interpretation, the function $\mathbf{f}$ itself is called a motion.

Different paths can trace out the same curve. For example, the two complex-valued functions

$$f(t) = e^{2\pi i t}, \qquad g(t) = e^{-2\pi i t}, \qquad 0 \le t \le 1,$$

each trace out the unit circle $x^2 + y^2 = 1$, but the points are visited in opposite directions. The same circle is traced out five times by the function $h(t) = e^{10\pi i t}$, $0 \le t \le 1$.

6.10 RECTIFIABLE PATHS AND ARC LENGTH 

Next we introduce the concept of arc length of a curve. The idea is to approximate 
the curve by inscribed polygons, a technique learned from ancient geometers. Our 
intuition tells us that the length of any inscribed polygon should not exceed that 
of the curve (since a straight line is the shortest path between two points), so the 
length of a curve should be an upper bound to the lengths of all inscribed polygons. 
Therefore, it seems natural to define the length of a curve to be the least upper 
bound of the lengths of all possible inscribed polygons. 

For most curves that arise in practice, this gives a useful definition of arc 
length. However, as we will see presently, there are curves for which there is no 
upper bound to the lengths of the inscribed polygons. Therefore, it becomes 
necessary to classify curves into two categories: those which have a length, and 
those which do not. The former are called rectifiable, the latter nonrectifiable. 

We now turn to a formal description of these ideas. 

Let $\mathbf{f} : [a, b] \to \mathbf{R}^n$ be a path in $\mathbf{R}^n$. For any partition of $[a, b]$ given by

$$P = \{t_0, t_1, \dots, t_m\},$$

the points $\mathbf{f}(t_0), \mathbf{f}(t_1), \dots, \mathbf{f}(t_m)$ are the vertices of an inscribed polygon. (An example is shown in Fig. 6.1.) The length of this polygon is denoted by $\Lambda_{\mathbf{f}}(P)$ and is defined to be the sum

$$\Lambda_{\mathbf{f}}(P) = \sum_{k=1}^{m} \|\mathbf{f}(t_k) - \mathbf{f}(t_{k-1})\|.$$

[Figure 6.1]

Definition 6.16. If the set of numbers $\Lambda_{\mathbf{f}}(P)$ is bounded for all partitions $P$ of $[a, b]$, then the path $\mathbf{f}$ is said to be rectifiable and its arc length, denoted by $\Lambda_{\mathbf{f}}(a, b)$, is defined by the equation

$$\Lambda_{\mathbf{f}}(a, b) = \sup \{ \Lambda_{\mathbf{f}}(P) : P \in \mathcal{P}[a, b] \}.$$

If the set of numbers $\Lambda_{\mathbf{f}}(P)$ is unbounded, $\mathbf{f}$ is called nonrectifiable.


It is an easy matter to characterize all rectifiable curves. 


Theorem 6.17. Consider a path $\mathbf{f} : [a, b] \to \mathbf{R}^n$ with components $\mathbf{f} = (f_1, \dots, f_n)$. Then $\mathbf{f}$ is rectifiable if, and only if, each component $f_k$ is of bounded variation on $[a, b]$. If $\mathbf{f}$ is rectifiable, we have the inequalities

$$V_k(a, b) \le \Lambda_{\mathbf{f}}(a, b) \le V_1(a, b) + \cdots + V_n(a, b), \qquad (k = 1, 2, \dots, n), \tag{2}$$

where $V_k(a, b)$ denotes the total variation of $f_k$ on $[a, b]$.

Proof. If $P = \{t_0, t_1, \dots, t_m\}$ is a partition of $[a, b]$ we have

$$\sum_{i=1}^{m} |f_k(t_i) - f_k(t_{i-1})| \le \Lambda_{\mathbf{f}}(P) \le \sum_{i=1}^{m} \sum_{j=1}^{n} |f_j(t_i) - f_j(t_{i-1})|, \tag{3}$$

for each $k$. All assertions of the theorem follow easily from (3).
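Inequality (3) can be checked on a sample path. The sketch below uses the unit circle $\mathbf{f}(t) = (\cos t, \sin t)$, $0 \le t \le 2\pi$, with a uniform partition; the path, the partition size, and the helper names are assumptions made for this illustration.

```python
import math

def path(t):
    """A sample path in R^2: the unit circle traced once."""
    return (math.cos(t), math.sin(t))

def polygon_length(points):
    # Length of the inscribed polygon: sum of distances between consecutive vertices
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(points, points[1:]))

def component_sum(points, k):
    # Sum of |f_k(t_i) - f_k(t_{i-1})| for the k-th component
    return sum(abs(q[k] - p[k]) for p, q in zip(points, points[1:]))

m = 1000
ts = [2 * math.pi * i / m for i in range(m + 1)]   # uniform partition of [0, 2*pi]
vertices = [path(t) for t in ts]

length = polygon_length(vertices)
sums = [component_sum(vertices, k) for k in range(2)]

# Each component sum <= polygon length <= sum of all component sums
print(sums, length, sum(sums))
```

Each component sum is close to 4, the polygon length is slightly below $2\pi$, and the sum of the component sums is close to 8, so the printed values are ordered as inequality (3) requires.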



Examples 

1. As noted earlier, the function given by $f(x) = x \cos \{\pi/(2x)\}$ for $x \neq 0$, $f(0) = 0$, is continuous but not of bounded variation on $[0, 1]$. Therefore its graph is a nonrectifiable curve.

2. It can be shown (Exercise 7.21) that if $\mathbf{f}'$ is continuous on $[a, b]$, then $\mathbf{f}$ is rectifiable and its arc length can be expressed as an integral,

$$\Lambda_{\mathbf{f}}(a, b) = \int_a^b \|\mathbf{f}'(t)\| \, dt.$$


6.11 ADDITIVE AND CONTINUITY PROPERTIES OF ARC LENGTH 

Let $\mathbf{f} = (f_1, \dots, f_n)$ be a rectifiable path defined on $[a, b]$. Then each component $f_k$ is of bounded variation on every subinterval $[x, y]$ of $[a, b]$. In this section we keep $\mathbf{f}$ fixed and study the arc length $\Lambda_{\mathbf{f}}(x, y)$ as a function of the interval $[x, y]$. First we prove an additive property.

Theorem 6.18. If $c \in (a, b)$ we have

$$\Lambda_{\mathbf{f}}(a, b) = \Lambda_{\mathbf{f}}(a, c) + \Lambda_{\mathbf{f}}(c, b).$$

Proof. Adjoining the point $c$ to a partition $P$ of $[a, b]$, we get a partition $P_1$ of $[a, c]$ and a partition $P_2$ of $[c, b]$ such that

$$\Lambda_{\mathbf{f}}(P) \le \Lambda_{\mathbf{f}}(P_1) + \Lambda_{\mathbf{f}}(P_2) \le \Lambda_{\mathbf{f}}(a, c) + \Lambda_{\mathbf{f}}(c, b).$$

This implies $\Lambda_{\mathbf{f}}(a, b) \le \Lambda_{\mathbf{f}}(a, c) + \Lambda_{\mathbf{f}}(c, b)$. To obtain the reverse inequality, let $P_1$ and $P_2$ be arbitrary partitions of $[a, c]$ and $[c, b]$, respectively. Then

$$P = P_1 \cup P_2$$

is a partition of $[a, b]$ for which we have

$$\Lambda_{\mathbf{f}}(P_1) + \Lambda_{\mathbf{f}}(P_2) = \Lambda_{\mathbf{f}}(P) \le \Lambda_{\mathbf{f}}(a, b).$$

Since the supremum of all sums $\Lambda_{\mathbf{f}}(P_1) + \Lambda_{\mathbf{f}}(P_2)$ is the sum $\Lambda_{\mathbf{f}}(a, c) + \Lambda_{\mathbf{f}}(c, b)$ (see Theorem 1.15), the theorem follows.

Theorem 6.19. Consider a rectifiable path $\mathbf{f}$ defined on $[a, b]$. If $x \in (a, b]$, let
$s(x) = \Lambda_{\mathbf{f}}(a, x)$ and let $s(a) = 0$. Then we have:

i) The function $s$ so defined is increasing and continuous on $[a, b]$.

ii) If there is no subinterval of $[a, b]$ on which $\mathbf{f}$ is constant, then $s$ is strictly
increasing on $[a, b]$.

Proof. If $a \le x < y \le b$, Theorem 6.18 implies $s(y) - s(x) = \Lambda_{\mathbf{f}}(x, y) \ge 0$.
This proves that $s$ is increasing on $[a, b]$. Furthermore, we have $s(y) - s(x) > 0$
unless $\Lambda_{\mathbf{f}}(x, y) = 0$. But, by inequality (2), $\Lambda_{\mathbf{f}}(x, y) = 0$ implies $V_k(x, y) = 0$ for
each $k$ and this, in turn, implies that $\mathbf{f}$ is constant on $[x, y]$. Hence (ii) holds.

To prove that $s$ is continuous, we use inequality (2) again to write

$$0 \le s(y) - s(x) = \Lambda_{\mathbf{f}}(x, y) \le \sum_{k=1}^{n} V_k(x, y).$$

If we let $y \to x$, we find each term $V_k(x, y) \to 0$ and hence $s(x) = s(x+)$. Similarly,
$s(x) = s(x-)$ and the proof is complete.

6.12 EQUIVALENCE OF PATHS. CHANGE OF PARAMETER 

This section describes a class of paths having the same graph. Let $\mathbf{f} : [a, b] \to \mathbf{R}^n$
be a path in $\mathbf{R}^n$. Let $u : [c, d] \to [a, b]$ be a real-valued function, continuous and
strictly monotonic on $[c, d]$ with range $[a, b]$. Then the composite function
$\mathbf{g} = \mathbf{f} \circ u$ given by

$$\mathbf{g}(t) = \mathbf{f}[u(t)] \quad \text{for } c \le t \le d,$$

is a path having the same graph as f. Two paths f and g so related are called 
equivalent. They are said to provide different parametric representations of the 
same curve. The function u is said to define a change of parameter. 

Let C denote the common graph of two equivalent paths f and g. If u is 
strictly increasing, we say that f and g trace out C in the same direction. If u is 
strictly decreasing, we say that f and g trace out C in opposite directions. In the 
first case, u is said to be orientation-preserving ; in the second case, orientation- 
reversing. 
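A change of parameter can be tried out numerically. In the Python sketch below, the half circle `f`, the change of parameter `u(t) = pi * t**2`, and the helper `polygon_length` are illustrative assumptions, not objects taken from the text. Since `g` and `f` visit the same points when the partition of $[c, d]$ is mapped by $u$, the two inscribed polygons coincide.

```python
import math

def polygon_length(path, partition):
    """Length of the polygon inscribed in `path` at the points of `partition`."""
    points = [path(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def f(t):
    # Upper half of the unit circle, traced for t in [0, pi].
    return (math.cos(t), math.sin(t))

def u(t):
    # Continuous, strictly increasing map of [0, 1] onto [0, pi].
    return math.pi * t * t

def g(t):
    # Equivalent path g = f o u, defined on [0, 1].
    return f(u(t))

m = 2000
P_g = [i / m for i in range(m + 1)]   # partition of [0, 1]
P_f = [u(t) for t in P_g]             # corresponding partition of [0, pi]

length_g = polygon_length(g, P_g)
length_f = polygon_length(f, P_f)
print(length_g, length_f, math.pi)
```

Because $u$ is strictly increasing here, this particular change of parameter is orientation-preserving; replacing it by `math.pi * (1 - t * t)` would give an orientation-reversing one with the same polygon lengths.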

Theorem 6.20. Let $\mathbf{f} : [a, b] \to \mathbf{R}^n$ and $\mathbf{g} : [c, d] \to \mathbf{R}^n$ be two paths in $\mathbf{R}^n$, each
of which is one-to-one on its domain. Then $\mathbf{f}$ and $\mathbf{g}$ are equivalent if, and only if, they
have the same graph.

Proof. Equivalent paths necessarily have the same graph. To prove the converse,
assume that $\mathbf{f}$ and $\mathbf{g}$ have the same graph. Since $\mathbf{f}$ is one-to-one and continuous on
the compact set $[a, b]$, Theorem 4.29 tells us that $\mathbf{f}^{-1}$ exists and is continuous on
its graph. Define $u(t) = \mathbf{f}^{-1}[\mathbf{g}(t)]$ if $t \in [c, d]$. Then $u$ is continuous on $[c, d]$
and $\mathbf{g}(t) = \mathbf{f}[u(t)]$. The reader can verify that $u$ is strictly monotonic, and hence
$\mathbf{f}$ and $\mathbf{g}$ are equivalent paths.

EXERCISES 


Functions of bounded variation 

6.1 Determine which of the following functions are of bounded variation on [0, 1]. 

a) $f(x) = x^2 \sin(1/x)$ if $x \ne 0$, $f(0) = 0$.

b) $f(x) = \sqrt{x} \sin(1/x)$ if $x \ne 0$, $f(0) = 0$.

6.2 A function $f$, defined on $[a, b]$, is said to satisfy a uniform Lipschitz condition of
order $\alpha > 0$ on $[a, b]$ if there exists a constant $M > 0$ such that $|f(x) - f(y)| <
M|x - y|^{\alpha}$ for all $x$ and $y$ in $[a, b]$. (Compare with Exercise 5.1.)

a) If $f$ is such a function, show that $\alpha > 1$ implies $f$ is constant on $[a, b]$, whereas
$\alpha = 1$ implies $f$ is of bounded variation on $[a, b]$.

b) Give an example of a function $f$ satisfying a uniform Lipschitz condition of order
$\alpha < 1$ on $[a, b]$ such that $f$ is not of bounded variation on $[a, b]$.

c) Give an example of a function $f$ which is of bounded variation on $[a, b]$ but
which satisfies no uniform Lipschitz condition on $[a, b]$.

6.3 Show that a polynomial $f$ is of bounded variation on every compact interval $[a, b]$.
Describe a method for finding the total variation of $f$ on $[a, b]$ if the zeros of the derivative
$f'$ are known.

6.4 A nonempty set $S$ of real-valued functions defined on an interval $[a, b]$ is called a
linear space of functions if it has the following two properties:

a) If $f \in S$, then $cf \in S$ for every real number $c$.

b) If $f \in S$ and $g \in S$, then $f + g \in S$.

Theorem 6.9 shows that the set $V$ of all functions of bounded variation on $[a, b]$ is a linear
space. If $S$ is any linear space which contains all monotonic functions on $[a, b]$, prove
that $V \subseteq S$. This can be described by saying that the functions of bounded variation
form the smallest linear space containing all monotonic functions.

6.5 Let $f$ be a real-valued function defined on $[0, 1]$ such that $f(0) > 0$, $f(x) \ne x$ for
all $x$, and $f(x) \le f(y)$ whenever $x \le y$. Let $A = \{x : f(x) > x\}$. Prove that $\sup A \in A$
and that $f(1) > 1$.

6.6 If $f$ is defined everywhere in $\mathbf{R}^1$, then $f$ is said to be of bounded variation on
$(-\infty, +\infty)$ if $f$ is of bounded variation on every finite interval and if there exists a positive
number $M$ such that $V_f(a, b) < M$ for all compact intervals $[a, b]$. The total variation of
$f$ on $(-\infty, +\infty)$ is then defined to be the sup of all numbers $V_f(a, b)$, $-\infty < a < b <
+\infty$, and is denoted by $V_f(-\infty, +\infty)$. Similar definitions apply to half-open infinite
intervals $[a, +\infty)$ and $(-\infty, b]$.

a) State and prove theorems for the infinite interval $(-\infty, +\infty)$ analogous to
Theorems 6.7, 6.9, 6.10, 6.11, and 6.12.




b) Show that Theorem 6.5 is true for $(-\infty, +\infty)$ if “monotonic” is replaced by
“bounded and monotonic.” State and prove a similar modification of Theorem
6.13.

6.7 Assume that $f$ is of bounded variation on $[a, b]$ and let

$$P = \{x_0, x_1, \dots, x_n\} \in \mathcal{P}[a, b].$$

As usual, write $\Delta f_k = f(x_k) - f(x_{k-1})$, $k = 1, 2, \dots, n$. Define

$$A(P) = \{k : \Delta f_k > 0\}, \qquad B(P) = \{k : \Delta f_k < 0\}.$$

The numbers

$$p_f(a, b) = \sup \Big\{ \sum_{k \in A(P)} \Delta f_k : P \in \mathcal{P}[a, b] \Big\}$$

and

$$n_f(a, b) = \sup \Big\{ \sum_{k \in B(P)} |\Delta f_k| : P \in \mathcal{P}[a, b] \Big\}$$

are called, respectively, the positive and negative variations of $f$ on $[a, b]$. For each $x$ in
$(a, b]$, let $V(x) = V_f(a, x)$, $p(x) = p_f(a, x)$, $n(x) = n_f(a, x)$, and let $V(a) = p(a) =
n(a) = 0$. Show that we have:

a) $V(x) = p(x) + n(x)$.

b) $0 \le p(x) \le V(x)$ and $0 \le n(x) \le V(x)$.

c) $p$ and $n$ are increasing on $[a, b]$.

d) $f(x) = f(a) + p(x) - n(x)$. Part (d) gives an alternative proof of Theorem 6.13.

e) $2p(x) = V(x) + f(x) - f(a)$, $2n(x) = V(x) - f(x) + f(a)$.

f) Every point of continuity of $f$ is also a point of continuity of $p$ and of $n$.

Curves 

6.8 Let $f$ and $g$ be complex-valued functions defined as follows:

$$f(t) = e^{2\pi i t} \ \text{ if } t \in [0, 1], \qquad g(t) = e^{2\pi i t} \ \text{ if } t \in [0, 2].$$

a) Prove that $f$ and $g$ have the same graph but are not equivalent according to the
definition in Section 6.12.

b) Prove that the length of $g$ is twice that of $f$.

6.9 Let $\mathbf{f}$ be a rectifiable path of length $L$ defined on $[a, b]$, and assume that $\mathbf{f}$ is not
constant on any subinterval of $[a, b]$. Let $s$ denote the arc-length function given by

$$s(x) = \Lambda_{\mathbf{f}}(a, x) \ \text{ if } a < x \le b, \qquad s(a) = 0.$$

a) Prove that $s^{-1}$ exists and is continuous on $[0, L]$.

b) Define $\mathbf{g}(t) = \mathbf{f}[s^{-1}(t)]$ if $t \in [0, L]$ and show that $\mathbf{g}$ is equivalent to $\mathbf{f}$. Since
$\mathbf{f}(t) = \mathbf{g}[s(t)]$, the function $\mathbf{g}$ is said to provide a representation of the graph of $\mathbf{f}$
with arc length as parameter.

6.10 Let $f$ and $g$ be two real-valued continuous functions of bounded variation defined
on $[a, b]$, with $0 < f(x) < g(x)$ for each $x$ in $(a, b)$, $f(a) = g(a)$, $f(b) = g(b)$. Let $h$ be
the complex-valued function defined on the interval $[a, 2b - a]$ as follows:

$$h(t) = t + i f(t), \quad \text{if } a \le t \le b,$$

$$h(t) = 2b - t + i g(2b - t), \quad \text{if } b \le t \le 2b - a.$$

a) Show that $h$ describes a rectifiable curve $\Gamma$.

b) Explain, by means of a sketch, the geometric relationship between $f$, $g$, and $h$.

c) Show that the set of points

$$S = \{(x, y) : a \le x \le b, \ f(x) \le y \le g(x)\}$$

is a region in $\mathbf{R}^2$ whose boundary is the curve $\Gamma$.

d) Let $H$ be the complex-valued function defined on $[a, 2b - a]$ as follows:

$$H(t) = t - \tfrac{1}{2} i [g(t) - f(t)], \quad \text{if } a \le t \le b,$$

$$H(t) = t + \tfrac{1}{2} i [g(2b - t) - f(2b - t)], \quad \text{if } b \le t \le 2b - a.$$

Show that $H$ describes a rectifiable curve $\Gamma_0$ which is the boundary of the region

$$S_0 = \{(x, y) : a \le x \le b, \ f(x) - g(x) \le 2y \le g(x) - f(x)\}.$$

e) Show that $S_0$ has the x-axis as a line of symmetry. (The region $S_0$ is called the
symmetrization of $S$ with respect to the x-axis.)

f) Show that the length of $\Gamma_0$ does not exceed the length of $\Gamma$.

Absolutely continuous functions 

A real-valued function $f$ defined on $[a, b]$ is said to be absolutely continuous on $[a, b]$ if
for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$\sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon$$

for every $n$ disjoint open subintervals $(a_k, b_k)$ of $[a, b]$, $n = 1, 2, \dots$, the sum of whose
lengths $\sum_{k=1}^{n} (b_k - a_k)$ is less than $\delta$.

Absolutely continuous functions occur in the Lebesgue theory of integration and 
differentiation. The following exercises give some of their elementary properties. 

6.11 Prove that every absolutely continuous function on $[a, b]$ is continuous and of
bounded variation on $[a, b]$.

note. There exist functions which are continuous and of bounded variation but not
absolutely continuous.

6.12 Prove that $f$ is absolutely continuous if it satisfies a uniform Lipschitz condition of
order 1 on $[a, b]$. (See Exercise 6.2.)

6.13 If $f$ and $g$ are absolutely continuous on $[a, b]$, prove that each of the following is
also: $|f|$, $cf$ ($c$ constant), $f + g$, $f \cdot g$; also $f/g$ if $g$ is bounded away from zero.

CHAPTER 7


THE RIEMANN-STIELTJES INTEGRAL 


7.1 INTRODUCTION 


Calculus deals principally with two geometric problems: finding the tangent line
to a curve, and finding the area of a region under a curve. The first is studied by a
limit process known as differentiation; the second by another limit process,
integration, to which we turn now.

The reader will recall from elementary calculus that to find the area of the
region under the graph of a positive function $f$ defined on $[a, b]$, we subdivide
the interval $[a, b]$ into a finite number of subintervals, say $n$, the $k$th subinterval
having length $\Delta x_k$, and we consider sums of the form $\sum_{k=1}^{n} f(t_k) \, \Delta x_k$, where $t_k$ is
some point in the $k$th subinterval. Such a sum is an approximation to the area by
means of rectangles. If $f$ is sufficiently well behaved in $[a, b]$ (continuous, for
example) then there is some hope that these sums will tend to a limit as we let
$n \to \infty$, making the successive subdivisions finer and finer. This, roughly speaking,
is what is involved in Riemann’s definition of the definite integral $\int_a^b f(x) \, dx$. (A
precise definition is given below.)

The two concepts, derivative and integral, arise in entirely different ways and
it is a remarkable fact indeed that the two are intimately connected. If we consider
the definite integral of a continuous function $f$ as a function of its upper limit,
say we write

$$F(x) = \int_a^x f(t) \, dt,$$

then $F$ has a derivative and $F'(x) = f(x)$. This important result shows that
differentiation and integration are, in a sense, inverse operations.

In this chapter we study the process of integration in some detail. Actually
we consider a more general concept than that of Riemann: this is the
Riemann-Stieltjes integral, which involves two functions $f$ and $\alpha$. The symbol for such an
integral is $\int_a^b f(x) \, d\alpha(x)$, or something similar, and the usual Riemann integral
occurs as the special case in which $\alpha(x) = x$. When $\alpha$ has a continuous derivative,
the definition is such that the Stieltjes integral $\int_a^b f(x) \, d\alpha(x)$ becomes the Riemann
integral $\int_a^b f(x) \alpha'(x) \, dx$. However, the Stieltjes integral still makes sense when $\alpha$
is not differentiable or even when $\alpha$ is discontinuous. In fact, it is in dealing with
discontinuous $\alpha$ that the importance of the Stieltjes integral becomes apparent. By
a suitable choice of a discontinuous $\alpha$, any finite or infinite sum can be expressed
as a Stieltjes integral, and summation and ordinary Riemann integration then
become special cases of this more general process. Problems in physics which
involve mass distributions that are partly discrete and partly continuous can also
be treated by using Stieltjes integrals. In the mathematical theory of probability
this integral is a very useful tool that makes possible the simultaneous treatment
of continuous and discrete random variables.

In Chapter 10 we discuss another generalization of the Riemann integral 
known as the Lebesgue integral. 

7.2 NOTATION 

For brevity we make certain stipulations concerning notation and terminology to
be used in this chapter. We shall be working with a compact interval $[a, b]$ and,
unless otherwise stated, all functions denoted by $f$, $g$, $\alpha$, $\beta$, etc., will be assumed to
be real-valued functions defined and bounded on $[a, b]$. Complex-valued functions
are dealt with in Section 7.27, and extensions to unbounded functions and infinite
intervals will be discussed in Chapter 10.

As in Chapter 6, a partition $P$ of $[a, b]$ is a finite set of points, say

$$P = \{x_0, x_1, \dots, x_n\},$$

such that $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. A partition $P'$ of $[a, b]$ is said
to be finer than $P$ (or a refinement of $P$) if $P \subseteq P'$, which we also write $P' \supseteq P$.
The symbol $\Delta\alpha_k$ denotes the difference $\Delta\alpha_k = \alpha(x_k) - \alpha(x_{k-1})$, so that

$$\sum_{k=1}^{n} \Delta\alpha_k = \alpha(b) - \alpha(a).$$

The set of all possible partitions of $[a, b]$ is denoted by $\mathcal{P}[a, b]$.

The norm of a partition $P$ is the length of the largest subinterval of $P$ and is
denoted by $\|P\|$. Note that

$$P' \supseteq P \quad \text{implies} \quad \|P'\| \le \|P\|.$$

That is, refinement of a partition decreases its norm, but the converse does not
necessarily hold.

7.3 THE DEFINITION OF THE RIEMANN-STIELTJES INTEGRAL

Definition 7.1. Let $P = \{x_0, x_1, \dots, x_n\}$ be a partition of $[a, b]$ and let $t_k$ be a
point in the subinterval $[x_{k-1}, x_k]$. A sum of the form

$$S(P, f, \alpha) = \sum_{k=1}^{n} f(t_k) \, \Delta\alpha_k$$

is called a Riemann-Stieltjes sum of $f$ with respect to $\alpha$. We say $f$ is Riemann-integrable
with respect to $\alpha$ on $[a, b]$, and we write “$f \in R(\alpha)$ on $[a, b]$” if there
exists a number $A$ having the following property: For every $\varepsilon > 0$, there exists a
partition $P_\varepsilon$ of $[a, b]$ such that for every partition $P$ finer than $P_\varepsilon$ and for every
choice of the points $t_k$ in $[x_{k-1}, x_k]$, we have $|S(P, f, \alpha) - A| < \varepsilon$.

When such a number $A$ exists, it is uniquely determined and is denoted by
$\int_a^b f \, d\alpha$ or by $\int_a^b f(x) \, d\alpha(x)$. We also say that the Riemann-Stieltjes integral $\int_a^b f \, d\alpha$
exists. The functions $f$ and $\alpha$ are referred to as the integrand and the integrator,
respectively. In the special case when $\alpha(x) = x$, we write $S(P, f)$ instead of
$S(P, f, \alpha)$, and $f \in R$ instead of $f \in R(\alpha)$. The integral is then called a Riemann
integral and is denoted by $\int_a^b f \, dx$ or by $\int_a^b f(x) \, dx$. The numerical value of
$\int_a^b f(x) \, d\alpha(x)$ depends only on $f$, $\alpha$, $a$, and $b$, and does not depend on the symbol $x$.
The letter $x$ is a “dummy variable” and may be replaced by any other convenient
symbol.

note. This is one of several accepted definitions of the Riemann-Stieltjes integral. 
An alternative (but not equivalent) definition is stated in Exercise 7.3. 
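A Riemann-Stieltjes sum is straightforward to compute for a given partition and choice of points $t_k$. The Python sketch below is an illustration only: the function names, the uniform partitions, the midpoint tags, and the sample pair $f(x) = x$, $\alpha(x) = x^2$ on $[0, 1]$ are assumptions made here. For this pair, elementary calculus suggests the value $\int_0^1 2x^2 \, dx = 2/3$, and the sums should approach it as the partition is refined.

```python
def riemann_stieltjes_sum(f, alpha, partition, tags):
    """S(P, f, alpha): sum of f(t_k) * (alpha(x_k) - alpha(x_{k-1}))."""
    total = 0.0
    for x_prev, x_k, t_k in zip(partition, partition[1:], tags):
        if not (x_prev <= t_k <= x_k):
            raise ValueError("each tag t_k must lie in [x_{k-1}, x_k]")
        total += f(t_k) * (alpha(x_k) - alpha(x_prev))
    return total

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def midpoints(partition):
    """One admissible choice of tags: the midpoint of each subinterval."""
    return [(x0 + x1) / 2 for x0, x1 in zip(partition, partition[1:])]

f = lambda x: x            # integrand (illustrative choice)
alpha = lambda x: x * x    # integrator (illustrative choice)

sums = {}
for n in (10, 100, 1000):
    P = uniform_partition(0.0, 1.0, n)
    sums[n] = riemann_stieltjes_sum(f, alpha, P, midpoints(P))
    print(n, sums[n])
```

Note that uniform partitions are used only for convenience; the definition above is phrased in terms of refinements of a partition, not in terms of its norm.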


7.4 LINEAR PROPERTIES 


It is an easy matter to prove that the integral operates in a linear fashion on both 
the integrand and the integrator. This is the context of the next two theorems. 

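Theorem 7.2. If $f \in R(\alpha)$ and if $g \in R(\alpha)$ on $[a, b]$, then $c_1 f + c_2 g \in R(\alpha)$ on
$[a, b]$ (for any two constants $c_1$ and $c_2$) and we have

$$\int_a^b (c_1 f + c_2 g) \, d\alpha = c_1 \int_a^b f \, d\alpha + c_2 \int_a^b g \, d\alpha.$$

Proof. Let $h = c_1 f + c_2 g$. Given a partition $P$ of $[a, b]$, we can write

$$S(P, h, \alpha) = \sum_{k=1}^{n} h(t_k) \, \Delta\alpha_k = c_1 \sum_{k=1}^{n} f(t_k) \, \Delta\alpha_k + c_2 \sum_{k=1}^{n} g(t_k) \, \Delta\alpha_k = c_1 S(P, f, \alpha) + c_2 S(P, g, \alpha).$$

Given $\varepsilon > 0$, choose $P'_\varepsilon$ so that $P \supseteq P'_\varepsilon$ implies $|S(P, f, \alpha) - \int_a^b f \, d\alpha| < \varepsilon$, and
choose $P''_\varepsilon$ so that $P \supseteq P''_\varepsilon$ implies $|S(P, g, \alpha) - \int_a^b g \, d\alpha| < \varepsilon$. If we take
$P_\varepsilon = P'_\varepsilon \cup P''_\varepsilon$, then, for $P$ finer than $P_\varepsilon$, we have

$$\left| S(P, h, \alpha) - c_1 \int_a^b f \, d\alpha - c_2 \int_a^b g \, d\alpha \right| \le |c_1| \varepsilon + |c_2| \varepsilon,$$

and this proves the theorem.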


Theorem 7.3. If $f \in R(\alpha)$ and $f \in R(\beta)$ on $[a, b]$, then $f \in R(c_1 \alpha + c_2 \beta)$ on $[a, b]$
(for any two constants $c_1$ and $c_2$) and we have

$$\int_a^b f \, d(c_1 \alpha + c_2 \beta) = c_1 \int_a^b f \, d\alpha + c_2 \int_a^b f \, d\beta.$$

The proof is similar to that of Theorem 7.2 and is left as an exercise. 

A result somewhat analogous to the previous two theorems tells us that the
integral is also additive with respect to the interval of integration.

Theorem 7.4. Assume that $c \in (a, b)$. If two of the three integrals in (1) exist, then
the third also exists and we have

$$\int_a^c f \, d\alpha + \int_c^b f \, d\alpha = \int_a^b f \, d\alpha. \tag{1}$$

Proof. If $P$ is a partition of $[a, b]$ such that $c \in P$, let

$$P' = P \cap [a, c] \quad \text{and} \quad P'' = P \cap [c, b],$$

denote the corresponding partitions of $[a, c]$ and $[c, b]$, respectively. The
Riemann-Stieltjes sums for these partitions are connected by the equation

$$S(P, f, \alpha) = S(P', f, \alpha) + S(P'', f, \alpha).$$

Assume that $\int_a^c f \, d\alpha$ and $\int_c^b f \, d\alpha$ exist. Then, given $\varepsilon > 0$, there is a partition
$P'_\varepsilon$ of $[a, c]$ such that

$$\left| S(P', f, \alpha) - \int_a^c f \, d\alpha \right| < \frac{\varepsilon}{2} \quad \text{whenever } P' \text{ is finer than } P'_\varepsilon,$$

and a partition $P''_\varepsilon$ of $[c, b]$ such that

$$\left| S(P'', f, \alpha) - \int_c^b f \, d\alpha \right| < \frac{\varepsilon}{2} \quad \text{whenever } P'' \text{ is finer than } P''_\varepsilon.$$

Then $P_\varepsilon = P'_\varepsilon \cup P''_\varepsilon$ is a partition of $[a, b]$ such that $P$ finer than $P_\varepsilon$ implies
$P' \supseteq P'_\varepsilon$ and $P'' \supseteq P''_\varepsilon$. Hence, if $P$ is finer than $P_\varepsilon$, we can combine the foregoing
results to obtain the inequality

$$\left| S(P, f, \alpha) - \int_a^c f \, d\alpha - \int_c^b f \, d\alpha \right| < \varepsilon.$$

This proves that $\int_a^b f \, d\alpha$ exists and equals $\int_a^c f \, d\alpha + \int_c^b f \, d\alpha$. The reader can easily
verify that a similar argument proves the theorem in the remaining cases.


Using mathematical induction, we can prove a similar result for a decomposition
of $[a, b]$ into a finite number of subintervals.


note. The preceding type of argument cannot be used to prove that the integral
$\int_a^c f \, d\alpha$ exists whenever $\int_a^b f \, d\alpha$ exists. The conclusion is correct, however. For
integrators $\alpha$ of bounded variation, this fact will later be proved in Theorem 7.25.

Definition 7.5. If $a < b$, we define $\int_b^a f \, d\alpha = -\int_a^b f \, d\alpha$ whenever $\int_a^b f \, d\alpha$ exists.
We also define $\int_a^a f \, d\alpha = 0$.

The equation in Theorem 7.4 can now be written as follows:

$$\int_a^b f \, d\alpha + \int_b^c f \, d\alpha + \int_c^a f \, d\alpha = 0.$$

7.5 INTEGRATION BY PARTS


A remarkable connection exists between the integrand and the integrator in a
Riemann-Stieltjes integral. The existence of $\int_a^b f \, d\alpha$ implies the existence of
$\int_a^b \alpha \, df$, and the converse is also true. Moreover, a very simple relation holds
between the two integrals.


Theorem 7.6. If $f \in R(\alpha)$ on $[a, b]$, then $\alpha \in R(f)$ on $[a, b]$ and we have

$$\int_a^b f(x) \, d\alpha(x) + \int_a^b \alpha(x) \, df(x) = f(b)\alpha(b) - f(a)\alpha(a).$$

note. This equation, which provides a kind of reciprocity law for the integral, is
known as the formula for integration by parts.

Proof. Let $\varepsilon > 0$ be given. Since $\int_a^b f \, d\alpha$ exists, there is a partition $P_\varepsilon$ of $[a, b]$
such that for every $P'$ finer than $P_\varepsilon$, we have

$$\left| S(P', f, \alpha) - \int_a^b f \, d\alpha \right| < \varepsilon. \tag{2}$$

Consider an arbitrary Riemann-Stieltjes sum for the integral $\int_a^b \alpha \, df$, say

$$S(P, \alpha, f) = \sum_{k=1}^{n} \alpha(t_k) \, \Delta f_k = \sum_{k=1}^{n} \alpha(t_k) f(x_k) - \sum_{k=1}^{n} \alpha(t_k) f(x_{k-1}),$$

where $P$ is finer than $P_\varepsilon$. Writing $A = f(b)\alpha(b) - f(a)\alpha(a)$, we have the identity

$$A = \sum_{k=1}^{n} f(x_k)\alpha(x_k) - \sum_{k=1}^{n} f(x_{k-1})\alpha(x_{k-1}).$$

Subtracting the last two displayed equations, we find

$$A - S(P, \alpha, f) = \sum_{k=1}^{n} f(x_k)[\alpha(x_k) - \alpha(t_k)] + \sum_{k=1}^{n} f(x_{k-1})[\alpha(t_k) - \alpha(x_{k-1})].$$

The two sums on the right can be combined into a single sum of the form $S(P', f, \alpha)$,
where $P'$ is that partition of $[a, b]$ obtained by taking the points $x_k$ and $t_k$ together.
Then $P'$ is finer than $P$ and hence finer than $P_\varepsilon$. Therefore the inequality (2) is
valid and this means that we have

$$\left| A - S(P, \alpha, f) - \int_a^b f \, d\alpha \right| < \varepsilon,$$

whenever $P$ is finer than $P_\varepsilon$. But this is exactly the statement that $\int_a^b \alpha \, df$ exists
and equals $A - \int_a^b f \, d\alpha$.
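The key step of the proof is an exact algebraic identity between two Riemann-Stieltjes sums, and it can be observed directly in floating-point arithmetic. In the Python sketch below, the choices $f = \sin$, $\alpha(x) = x^3$, the interval $[0, 1]$, the midpoint tags, and all identifiers are illustrative assumptions. The refinement $P'$ is built from the points $x_k$ and $t_k$ together, with tag $x_{k-1}$ on $[x_{k-1}, t_k]$ and tag $x_k$ on $[t_k, x_k]$.

```python
import math

def rs_sum(f, alpha, partition, tags):
    """Riemann-Stieltjes sum S(P, f, alpha) for the given tags."""
    return sum(f(t) * (alpha(x1) - alpha(x0))
               for x0, x1, t in zip(partition, partition[1:], tags))

f = math.sin                 # illustrative integrand
alpha = lambda x: x ** 3     # illustrative integrator
a, b, n = 0.0, 1.0, 200

P = [a + (b - a) * k / n for k in range(n + 1)]
tags = [(x0 + x1) / 2 for x0, x1 in zip(P, P[1:])]

A = f(b) * alpha(b) - f(a) * alpha(a)
S_alpha_f = rs_sum(alpha, f, P, tags)      # S(P, alpha, f): roles of f and alpha swapped

# Build P' from the x_k and t_k together, with the tags used in the proof.
P_prime, tags_prime = [P[0]], []
for x0, x1, t in zip(P, P[1:], tags):
    P_prime += [t, x1]
    tags_prime += [x0, x1]
S_f_alpha = rs_sum(f, alpha, P_prime, tags_prime)   # S(P', f, alpha)

# The identity A - S(P, alpha, f) = S(P', f, alpha) holds up to rounding.
print(A - S_alpha_f, S_f_alpha)
```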


7.6 CHANGE OF VARIABLE IN A RIEMANN-STIELTJES INTEGRAL 

Theorem 7.7. Let $f \in R(\alpha)$ on $[a, b]$ and let $g$ be a strictly monotonic continuous
function defined on an interval $S$ having endpoints $c$ and $d$. Assume that $a = g(c)$,
$b = g(d)$. Let $h$ and $\beta$ be the composite functions defined as follows:

$$h(x) = f[g(x)], \qquad \beta(x) = \alpha[g(x)], \qquad \text{if } x \in S.$$

Then $h \in R(\beta)$ on $S$ and we have $\int_a^b f \, d\alpha = \int_c^d h \, d\beta$. That is,

$$\int_{g(c)}^{g(d)} f(t) \, d\alpha(t) = \int_c^d f[g(x)] \, d\{\alpha[g(x)]\}.$$

Proof. For definiteness, assume that $g$ is strictly increasing on $S$. (This implies
$c < d$.) Then $g$ is one-to-one and has a strictly increasing, continuous inverse $g^{-1}$
defined on $[a, b]$. Therefore, for every partition $P = \{y_0, \dots, y_n\}$ of $[c, d]$,
there corresponds one and only one partition $P' = \{x_0, \dots, x_n\}$ of $[a, b]$ with
$x_k = g(y_k)$. In fact, we can write

$$P' = g(P) \quad \text{and} \quad P = g^{-1}(P').$$

Furthermore, a refinement of $P$ produces a corresponding refinement of $P'$, and
the converse also holds.

If $\varepsilon > 0$ is given, there is a partition $P'_\varepsilon$ of $[a, b]$ such that $P'$ finer than $P'_\varepsilon$
implies $|S(P', f, \alpha) - \int_a^b f \, d\alpha| < \varepsilon$. Let $P_\varepsilon = g^{-1}(P'_\varepsilon)$ be the corresponding
partition of $[c, d]$, and let $P = \{y_0, \dots, y_n\}$ be a partition of $[c, d]$ finer than $P_\varepsilon$.
Form a Riemann-Stieltjes sum

$$S(P, h, \beta) = \sum_{k=1}^{n} h(u_k) \, \Delta\beta_k,$$

where $u_k \in [y_{k-1}, y_k]$ and $\Delta\beta_k = \beta(y_k) - \beta(y_{k-1})$. If we put $t_k = g(u_k)$ and
$x_k = g(y_k)$, then $P' = \{x_0, \dots, x_n\}$ is a partition of $[a, b]$ finer than $P'_\varepsilon$. Moreover,
we then have

$$S(P, h, \beta) = \sum_{k=1}^{n} f[g(u_k)] \{\alpha[g(y_k)] - \alpha[g(y_{k-1})]\} = \sum_{k=1}^{n} f(t_k) \{\alpha(x_k) - \alpha(x_{k-1})\} = S(P', f, \alpha),$$

since $t_k \in [x_{k-1}, x_k]$. Therefore, $|S(P, h, \beta) - \int_a^b f \, d\alpha| < \varepsilon$ and the theorem is
proved.

note. This theorem applies, in particular, to Riemann integrals, that is, when
$\alpha(x) = x$. Another theorem of this type, in which $g$ is not required to be
monotonic, will later be proved for Riemann integrals. (See Theorem 7.36.)


7.7 REDUCTION TO A RIEMANN INTEGRAL 

The next theorem tells us that we are permitted to replace the symbol $d\alpha(x)$ by
$\alpha'(x) \, dx$ in the integral $\int_a^b f(x) \, d\alpha(x)$ whenever $\alpha$ has a continuous derivative $\alpha'$.





Theorem 7.8. Assume $f \in R(\alpha)$ on $[a, b]$ and assume that $\alpha$ has a continuous
derivative $\alpha'$ on $[a, b]$. Then the Riemann integral $\int_a^b f(x)\alpha'(x) \, dx$ exists and we have

$$\int_a^b f(x) \, d\alpha(x) = \int_a^b f(x)\alpha'(x) \, dx.$$

Proof. Let $g(x) = f(x)\alpha'(x)$ and consider a Riemann sum

$$S(P, g) = \sum_{k=1}^{n} g(t_k) \, \Delta x_k = \sum_{k=1}^{n} f(t_k)\alpha'(t_k) \, \Delta x_k.$$

The same partition $P$ and the same choice of the $t_k$ can be used to form the
Riemann-Stieltjes sum

$$S(P, f, \alpha) = \sum_{k=1}^{n} f(t_k) \, \Delta\alpha_k.$$

Applying the Mean-Value Theorem, we can write

$$\Delta\alpha_k = \alpha'(v_k) \, \Delta x_k, \quad \text{where } v_k \in (x_{k-1}, x_k),$$

and hence

$$S(P, f, \alpha) - S(P, g) = \sum_{k=1}^{n} f(t_k)[\alpha'(v_k) - \alpha'(t_k)] \, \Delta x_k.$$

Since $f$ is bounded, we have $|f(x)| \le M$ for all $x$ in $[a, b]$, where $M > 0$.
Continuity of $\alpha'$ on $[a, b]$ implies uniform continuity on $[a, b]$. Hence, if $\varepsilon > 0$ is
given, there exists a $\delta > 0$ (depending only on $\varepsilon$) such that

$$0 \le |x - y| < \delta \quad \text{implies} \quad |\alpha'(x) - \alpha'(y)| < \frac{\varepsilon}{2M(b - a)}.$$

If we take a partition $P'_\varepsilon$ with norm $\|P'_\varepsilon\| < \delta$, then for any finer partition $P$ we
will have $|\alpha'(v_k) - \alpha'(t_k)| < \varepsilon/[2M(b - a)]$ in the preceding equation. For such
$P$ we therefore have

$$|S(P, f, \alpha) - S(P, g)| < \frac{\varepsilon}{2}.$$

On the other hand, since $f \in R(\alpha)$ on $[a, b]$, there exists a partition $P''_\varepsilon$ such that
$P$ finer than $P''_\varepsilon$ implies

$$\left| S(P, f, \alpha) - \int_a^b f \, d\alpha \right| < \frac{\varepsilon}{2}.$$

Combining the last two inequalities, we see that when $P$ is finer than $P_\varepsilon = P'_\varepsilon \cup P''_\varepsilon$,
we will have $|S(P, g) - \int_a^b f \, d\alpha| < \varepsilon$, and this proves the theorem.
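The proof compares the sums $S(P, f, \alpha)$ and $S(P, g)$ formed with the same partition and the same points $t_k$. The Python sketch below tabulates the gap between them for one example; the choices $f = \cos$, $\alpha = \exp$ on $[0, 1]$, the uniform partitions, the left-endpoint tags, and the helper names are assumptions made only for illustration.

```python
import math

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

f = math.cos              # illustrative integrand
alpha = math.exp          # illustrative integrator
alpha_prime = math.exp    # its (continuous) derivative

def gap(n):
    """|S(P, f, alpha) - S(P, g)| with g = f * alpha', using left-endpoint tags."""
    P = uniform_partition(0.0, 1.0, n)
    s_stieltjes = sum(f(x0) * (alpha(x1) - alpha(x0)) for x0, x1 in zip(P, P[1:]))
    s_riemann = sum(f(x0) * alpha_prime(x0) * (x1 - x0) for x0, x1 in zip(P, P[1:]))
    return abs(s_stieltjes - s_riemann)

gaps = {n: gap(n) for n in (10, 100, 1000)}
for n, value in gaps.items():
    print(n, value)   # the gap shrinks as the norm of the partition decreases
```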


note. A stronger result not requiring continuity of $\alpha'$ is proved in Theorem 7.35.

7.8 STEP FUNCTIONS AS INTEGRATORS

If $\alpha$ is constant throughout $[a, b]$, the integral $\int_a^b f \, d\alpha$ exists and has the value 0,
since each sum $S(P, f, \alpha) = 0$. However, if $\alpha$ is constant except for a jump
discontinuity at one point, the integral $\int_a^b f \, d\alpha$ need not exist and, if it does exist, its
value need not be zero. The situation is described more fully in the following
theorem:


Theorem 7.9. Given $a < c < b$. Define $\alpha$ on $[a, b]$ as follows: The values $\alpha(a)$,
$\alpha(c)$, $\alpha(b)$ are arbitrary;

$$\alpha(x) = \alpha(a) \quad \text{if } a \le x < c,$$

and

$$\alpha(x) = \alpha(b) \quad \text{if } c < x \le b.$$

Let $f$ be defined on $[a, b]$ in such a way that at least one of the functions $f$ or $\alpha$ is
continuous from the left at $c$ and at least one is continuous from the right at $c$. Then
$f \in R(\alpha)$ on $[a, b]$ and we have

$$\int_a^b f \, d\alpha = f(c)[\alpha(c+) - \alpha(c-)].$$

note. The result also holds if $c = a$, provided that we write $\alpha(c)$ for $\alpha(c-)$, and
it holds for $c = b$ if we write $\alpha(c)$ for $\alpha(c+)$. We will prove later (Theorem 7.29)
that the integral does not exist if both $f$ and $\alpha$ are discontinuous from the right or
from the left at $c$.


Proof. If $c \in P$, every term in the sum $S(P, f, \alpha)$ is zero except the two terms arising
from the subinterval separated by $c$, say

$$S(P, f, \alpha) = f(t_{k-1})[\alpha(c) - \alpha(c-)] + f(t_k)[\alpha(c+) - \alpha(c)],$$

where $t_{k-1} \le c \le t_k$. This equation can also be written as follows:

$$\Delta = [f(t_{k-1}) - f(c)][\alpha(c) - \alpha(c-)] + [f(t_k) - f(c)][\alpha(c+) - \alpha(c)],$$

where $\Delta = S(P, f, \alpha) - f(c)[\alpha(c+) - \alpha(c-)]$. Hence we have

$$|\Delta| \le |f(t_{k-1}) - f(c)| \, |\alpha(c) - \alpha(c-)| + |f(t_k) - f(c)| \, |\alpha(c+) - \alpha(c)|.$$

If $f$ is continuous at $c$, for every $\varepsilon > 0$ there is a $\delta > 0$ such that $\|P\| < \delta$ implies

$$|f(t_{k-1}) - f(c)| < \varepsilon \quad \text{and} \quad |f(t_k) - f(c)| < \varepsilon.$$

In this case, we obtain the inequality

$$|\Delta| \le \varepsilon |\alpha(c) - \alpha(c-)| + \varepsilon |\alpha(c+) - \alpha(c)|.$$

But this inequality holds whether or not $f$ is continuous at $c$. For example, if $f$ is
discontinuous both from the right and from the left at $c$, then $\alpha(c) = \alpha(c-)$ and
$\alpha(c) = \alpha(c+)$ and we get $\Delta = 0$. On the other hand, if $f$ is continuous from the
left and discontinuous from the right at $c$, we must have $\alpha(c) = \alpha(c+)$ and we get
$|\Delta| \le \varepsilon |\alpha(c) - \alpha(c-)|$. Similarly, if $f$ is continuous from the right and
discontinuous from the left at $c$, we have $\alpha(c) = \alpha(c-)$ and $|\Delta| \le \varepsilon |\alpha(c+) - \alpha(c)|$.
Hence the last displayed inequality holds in every case. This proves the theorem.

Example. Theorem 7.9 tells us that the value of a Riemann-Stieltjes integral can be altered
by changing the value of $f$ at a single point. The following example shows that the
existence of the integral can also be affected by such a change. Let

$$\alpha(x) = 0, \ \text{ if } x \ne 0, \qquad \alpha(0) = -1,$$

$$f(x) = 1, \ \text{ if } -1 \le x \le +1.$$

In this case Theorem 7.9 implies $\int_{-1}^{1} f \, d\alpha = 0$. But if we re-define $f$ so that $f(0) = 2$ and
$f(x) = 1$ if $x \ne 0$, we can easily see that $\int_{-1}^{1} f \, d\alpha$ will not exist. In fact, when $P$ is a
partition which includes 0 as a point of subdivision, we find

$$S(P, f, \alpha) = f(t_k)[\alpha(x_k) - \alpha(0)] + f(t_{k-1})[\alpha(0) - \alpha(x_{k-2})] = f(t_k) - f(t_{k-1}),$$

where $x_{k-2} \le t_{k-1} \le 0 \le t_k \le x_k$. The value of this sum is 0, 1, or $-1$, depending on
the choice of $t_k$ and $t_{k-1}$. Hence, $\int_{-1}^{1} f \, d\alpha$ does not exist in this case. However, in a
Riemann integral $\int_a^b f(x) \, dx$, the values of $f$ can be changed at a finite number of points
without affecting either the existence or the value of the integral. To prove this, it suffices
to consider the case where $f(x) = 0$ for all $x$ in $[a, b]$ except for one point, say $x = c$.
But for such a function it is obvious that $|S(P, f)| \le |f(c)| \, \|P\|$. Since $\|P\|$ can be made
arbitrarily small, it follows that $\int_a^b f(x) \, dx = 0$.
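The three possible values of the sum in this example can be produced by direct computation. The Python sketch below uses the re-defined $f$ with $f(0) = 2$ and the integrator $\alpha$ of the example; the particular partition of $[-1, 1]$, the tag choices, and the helper name `rs_sum` are assumptions made for illustration.

```python
def alpha(x):
    # Integrator of the example: zero everywhere except alpha(0) = -1.
    return -1.0 if x == 0 else 0.0

def f(x):
    # Re-defined integrand: f(0) = 2 and f(x) = 1 otherwise.
    return 2.0 if x == 0 else 1.0

def rs_sum(f, alpha, partition, tags):
    """Riemann-Stieltjes sum S(P, f, alpha) for the given tags."""
    return sum(f(t) * (alpha(x1) - alpha(x0))
               for x0, x1, t in zip(partition, partition[1:], tags))

P = [-1.0, -0.5, 0.0, 0.5, 1.0]   # a partition that includes 0

values = set()
for t_left in (-0.25, 0.0):        # tag in [-0.5, 0]: away from 0, or at 0
    for t_right in (0.0, 0.25):    # tag in [0, 0.5]: at 0, or away from 0
        tags = [-0.75, t_left, t_right, 0.75]
        values.add(rs_sum(f, alpha, P, tags))

print(sorted(values))   # the sum depends on the tags, whatever the refinement
```

Refining $P$ further does not change this behavior, since only the two subintervals adjacent to 0 contribute to the sum.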


7.9 REDUCTION OF A RIEMANN-STIELTJES INTEGRAL TO A FINITE SUM 

The integrator $\alpha$ in Theorem 7.9 is a special case of an important class of functions
known as step functions. These are functions which are constant throughout an
interval except for a finite number of jump discontinuities.

Definition 7.10 (Step function). A function $\alpha$ defined on $[a, b]$ is called a step function
if there is a partition

$$a = x_1 < x_2 < \cdots < x_n = b$$

such that $\alpha$ is constant on each open subinterval $(x_{k-1}, x_k)$. The number $\alpha(x_k+) -
\alpha(x_k-)$ is called the jump at $x_k$ if $1 < k < n$. The jump at $x_1$ is $\alpha(x_1+) - \alpha(x_1)$,
and the jump at $x_n$ is $\alpha(x_n) - \alpha(x_n-)$.

Step functions provide the connecting link between Riemann-Stieltjes integrals 
and finite sums: 

Theorem 7.11 (Reduction of a Riemann-Stieltjes integral to a finite sum). Let $\alpha$ be
a step function defined on $[a, b]$ with jump $\alpha_k$ at $x_k$, where $x_1, \dots, x_n$ are as described
in Definition 7.10. Let $f$ be defined on $[a, b]$ in such a way that not both $f$ and $\alpha$ are
discontinuous from the right or from the left at each $x_k$. Then $\int_a^b f \, d\alpha$ exists and we
have

$$\int_a^b f(x) \, d\alpha(x) = \sum_{k=1}^{n} f(x_k)\alpha_k.$$

Proof. By Theorem 7.4, $\int_a^b f \, d\alpha$ can be written as a sum of integrals of the type
considered in Theorem 7.9.


One of the simplest step functions is the greatest-integer function. Its value at
$x$ is the greatest integer which is less than or equal to $x$ and is denoted by $[x]$.
Thus, $[x]$ is the unique integer satisfying the inequalities $[x] \le x < [x] + 1$.


Theorem 7.12. Every finite sum can be written as a Riemann-Stieltjes integral. In
fact, given a sum $\sum_{k=1}^{n} a_k$, define $f$ on $[0, n]$ as follows:

$$f(x) = a_k \ \text{ if } k - 1 < x \le k \quad (k = 1, 2, \dots, n), \qquad f(0) = 0.$$

Then

$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} f(k) = \int_0^n f(x) \, d[x],$$

where $[x]$ is the greatest integer $\le x$.


Proof. The greatest-integer function is a step function, continuous from the right
and having jump 1 at each integer. The function $f$ is continuous from the left at
$1, 2, \dots, n$. Now apply Theorem 7.11.
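The construction in Theorem 7.12 can be reproduced numerically: a Riemann-Stieltjes sum with integrator $[x]$ picks up one term per integer. In the Python sketch below, the sample numbers `a`, the partition with four equal pieces per unit interval, the midpoint tags, and the helper name `f` are illustrative assumptions.

```python
import math

a = [3.0, -1.5, 4.0, 0.25, 2.0]   # sample terms a_1, ..., a_n (arbitrary)
n = len(a)

def f(x):
    """f(x) = a_k if k - 1 < x <= k, and f(0) = 0."""
    if x == 0:
        return 0.0
    return a[math.ceil(x) - 1]

m = 4                                         # pieces per unit interval
P = [i / m for i in range(n * m + 1)]         # partition of [0, n] containing 0, 1, ..., n
tags = [(x0 + x1) / 2 for x0, x1 in zip(P, P[1:])]

# Riemann-Stieltjes sum with the greatest-integer function as integrator:
# only subintervals ending at an integer have a nonzero increment.
S = sum(f(t) * (math.floor(x1) - math.floor(x0))
        for x0, x1, t in zip(P, P[1:], tags))

print(S, sum(a))
```

Because $f$ is constant on each interval $(k - 1, k]$, any other choice of tags inside the subintervals of this partition gives the same sum.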


7.10 EULER’S SUMMATION FORMULA 

We shall illustrate the use of Riemann-Stieltjes integrals by deriving a remarkable
formula known as Euler's summation formula, which relates the integral of a
function over an interval $[a, b]$ with the sum of the function values at the integers
in $[a, b]$. It can sometimes be used to approximate integrals by sums or, conversely,
to estimate the values of certain sums by means of integrals.


Theorem 7.13 (Euler’s summation formula). If $f$ has a continuous derivative $f'$ on
$[a, b]$, then we have

$$\sum_{a < n \le b} f(n) = \int_a^b f(x) \, dx + \int_a^b f'(x)((x)) \, dx + f(a)((a)) - f(b)((b)),$$

where $((x)) = x - [x]$. When $a$ and $b$ are integers, this becomes

$$\sum_{n=a}^{b} f(n) = \int_a^b f(x) \, dx + \int_a^b f'(x) \left( x - [x] - \tfrac{1}{2} \right) dx + \frac{f(a) + f(b)}{2}.$$

note. $\sum_{a < n \le b}$ means the sum from $n = [a] + 1$ to $n = [b]$.

Proof. Applying Theorem 7.6 (integration by parts), we have

$$\int_a^b f(x) \, d(x - [x]) + \int_a^b (x - [x]) \, df(x) = f(b)(b - [b]) - f(a)(a - [a]).$$

Since the greatest-integer function has unit jumps at the integers $[a] + 1$,
$[a] + 2, \dots, [b]$, we can write

$$\int_a^b f(x) \, d[x] = \sum_{a < n \le b} f(n).$$

If we combine this with the previous equation, the theorem follows at once.
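The integer case of Euler's summation formula lends itself to a quick numerical check. The Python sketch below evaluates both sides for $f(x) = 1/x$ on $[1, 20]$; the test function, the composite midpoint rule used for the integrals, and the helper names `midpoint_integral` and `euler_rhs` are assumptions made here for illustration. The integrals are computed one unit interval at a time, since $[x]$ is constant on each open interval $(j, j + 1)$.

```python
def midpoint_integral(g, lo, hi, m=2000):
    """Composite midpoint approximation to the Riemann integral of g over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(g(lo + (i + 0.5) * h) for i in range(m))

def euler_rhs(f, fprime, a, b):
    """Right-hand side of the integer case of Euler's formula, for integers a < b."""
    total = 0.0
    for j in range(a, b):
        total += midpoint_integral(f, j, j + 1)
        # On (j, j + 1) the greatest integer [x] equals j.
        total += midpoint_integral(lambda x, j=j: fprime(x) * (x - j - 0.5), j, j + 1)
    return total + (f(a) + f(b)) / 2

f = lambda x: 1.0 / x
fprime = lambda x: -1.0 / (x * x)
a, b = 1, 20

lhs = sum(f(n) for n in range(a, b + 1))   # sum of f(n) for n = a, ..., b
rhs = euler_rhs(f, fprime, a, b)
print(lhs, rhs)
```

The two printed numbers agree only up to the error of the midpoint rule; the formula itself is exact.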


7.11 MONOTONICALLY INCREASING INTEGRATORS. UPPER AND 
LOWER INTEGRALS 

The further theory of Riemann-Stieltjes integration will now be developed for
monotonically increasing integrators, and we shall see later (in Theorem 7.24) that
for many purposes this is just as general as studying the theory for integrators which
are of bounded variation.

When $\alpha$ is increasing, the differences $\Delta\alpha_k$ which appear in the Riemann-Stieltjes
sums are all nonnegative. This simple fact plays a vital role in the development
of the theory. For brevity, we shall use the abbreviation “$\alpha \nearrow$ on $[a, b]$” to
mean that “$\alpha$ is increasing on $[a, b]$.”

As stated earlier, to find the area of the region under the graph of a function
$f$ we consider Riemann sums $\sum f(t_k) \, \Delta x_k$ as approximations to the area by means
of rectangles. Such sums also arise quite naturally in certain physical problems
requiring the use of integration for their solution. Another approach to these
problems is by means of upper and lower Riemann sums. For example, in the case
of areas, we can consider approximations from “above” and from “below” by
means of the sums $\sum M_k \, \Delta x_k$ and $\sum m_k \, \Delta x_k$, where $M_k$ and $m_k$ denote, respectively,
the sup and inf of the function values in the $k$th subinterval. Our geometric
intuition tells us that the upper sums are at least as big as the area we seek, whereas
the lower sums cannot exceed this area. (See Fig. 7.1.) Therefore it seems natural
to ask: What is the smallest possible value of the upper sums? This leads us to
consider the inf of all upper sums, a number called the upper integral of $f$. The
lower integral is similarly defined to be the sup of all lower sums. For reasonable
functions (for example, continuous functions) both these integrals will be equal to
$\int_a^b f(x) \, dx$. However, in general, these integrals will be different and it becomes an
important problem to find conditions on the function which will ensure that the
upper and lower integrals will be the same. We now discuss this type of problem
for Riemann-Stieltjes integrals.

Definition 7.14. Let $P$ be a partition of $[a, b]$ and let

$$M_k(f) = \sup\{f(x) : x \in [x_{k-1}, x_k]\},$$
$$m_k(f) = \inf\{f(x) : x \in [x_{k-1}, x_k]\}.$$

The numbers

$$U(P, f, \alpha) = \sum_{k=1}^{n} M_k(f)\,\Delta\alpha_k \quad\text{and}\quad L(P, f, \alpha) = \sum_{k=1}^{n} m_k(f)\,\Delta\alpha_k$$

are called, respectively, the upper and lower Stieltjes sums of $f$ with respect to
$\alpha$ for the partition $P$.

NOTE. We always have $m_k(f) \le M_k(f)$. If $\alpha \nearrow$ on $[a, b]$, then
$\Delta\alpha_k \ge 0$ and we can also write
$m_k(f)\,\Delta\alpha_k \le M_k(f)\,\Delta\alpha_k$, from which it follows that the
lower sums do not exceed the upper sums. Furthermore, if $t_k \in [x_{k-1}, x_k]$,
then

$$m_k(f) \le f(t_k) \le M_k(f).$$

Therefore, when $\alpha \nearrow$, we have the inequalities

$$L(P, f, \alpha) \le S(P, f, \alpha) \le U(P, f, \alpha)$$

relating the upper and lower sums to the Riemann-Stieltjes sums. These
inequalities, which are frequently used in the material that follows, do not
necessarily hold when $\alpha$ is not an increasing function.

The next theorem shows that, for increasing a, refinement of the partition 
increases the lower sums and decreases the upper sums. 

Theorem 7.15. Assume that $\alpha \nearrow$ on $[a, b]$. Then:

i) If $P'$ is finer than $P$, we have

$$U(P', f, \alpha) \le U(P, f, \alpha) \quad\text{and}\quad L(P', f, \alpha) \ge L(P, f, \alpha).$$

ii) For any two partitions $P_1$ and $P_2$, we have

$$L(P_1, f, \alpha) \le U(P_2, f, \alpha).$$

Proof. It suffices to prove (i) when $P'$ contains exactly one more point than $P$,
say the point $c$. If $c$ is in the $i$th subinterval of $P$, we can write

$$U(P', f, \alpha) = \sum_{\substack{k=1 \\ k \ne i}}^{n} M_k(f)\,\Delta\alpha_k + M'[\alpha(c) - \alpha(x_{i-1})] + M''[\alpha(x_i) - \alpha(c)],$$

where $M'$ and $M''$ denote the sup of $f$ in $[x_{i-1}, c]$ and $[c, x_i]$. But, since

$$M' \le M_i(f) \quad\text{and}\quad M'' \le M_i(f),$$

we have $U(P', f, \alpha) \le U(P, f, \alpha)$. (The inequality for lower sums is
proved in a similar fashion.)

To prove (ii), let $P = P_1 \cup P_2$. Then we have

$$L(P_1, f, \alpha) \le L(P, f, \alpha) \le U(P, f, \alpha) \le U(P_2, f, \alpha).$$

NOTE. It follows from this theorem that we also have (for increasing $\alpha$)

$$m[\alpha(b) - \alpha(a)] \le L(P_1, f, \alpha) \le U(P_2, f, \alpha) \le M[\alpha(b) - \alpha(a)],$$

where $M$ and $m$ denote the sup and inf of $f$ on $[a, b]$.
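The behavior of upper and lower Stieltjes sums under refinement can be seen on a small example. The sketch below assumes that $f$ is increasing on $[0, 1]$, so that on each subinterval the inf and sup are attained at the left and right endpoints; the function name `stieltjes_sums` and the choices $f(x) = x^2$, $\alpha(x) = x^2$ are illustrative only.

```python
from fractions import Fraction

def stieltjes_sums(f, alpha, points):
    """Return (L, U) for the partition `points`, assuming f is increasing.

    For increasing f, m_k(f) = f(x_{k-1}) and M_k(f) = f(x_k).
    """
    lower = upper = Fraction(0)
    for left, right in zip(points, points[1:]):
        d_alpha = alpha(right) - alpha(left)  # nonnegative when alpha is increasing
        lower += f(left) * d_alpha
        upper += f(right) * d_alpha
    return lower, upper

def uniform(n):
    """Uniform partition of [0, 1] into n subintervals."""
    return [Fraction(k, n) for k in range(n + 1)]

f = lambda x: x * x
alpha = lambda x: x * x

L4, U4 = stieltjes_sums(f, alpha, uniform(4))
L8, U8 = stieltjes_sums(f, alpha, uniform(8))  # uniform(8) refines uniform(4)
print(L4, L8, U8, U4)
```

Here the lower sum does not decrease and the upper sum does not increase when the partition is refined, as part (i) of Theorem 7.15 asserts, and every lower sum stays below every upper sum, as in part (ii).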

Definition 7.16. Assume that $\alpha \nearrow$ on $[a, b]$. The upper Stieltjes
integral of $f$ with respect to $\alpha$ is defined as follows:

$$\overline{\int_a^b} f\,d\alpha = \inf\{U(P, f, \alpha) : P \in \mathcal{P}[a, b]\}.$$

The lower Stieltjes integral is similarly defined:

$$\underline{\int_a^b} f\,d\alpha = \sup\{L(P, f, \alpha) : P \in \mathcal{P}[a, b]\}.$$

NOTE. We sometimes write $\bar{I}(f, \alpha)$ and $\underline{I}(f, \alpha)$ for the
upper and lower integrals. In the special case where $\alpha(x) = x$, the upper and
lower sums are denoted by $U(P, f)$ and $L(P, f)$ and are called upper and lower
Riemann sums. The corresponding integrals, denoted by
$\overline{\int_a^b} f(x)\,dx$ and by $\underline{\int_a^b} f(x)\,dx$, are called
upper and lower Riemann integrals. They were first introduced by J. G. Darboux
(1875).

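Theorem 7.17. Assume that $\alpha \nearrow$ on $[a, b]$. Then
$\underline{I}(f, \alpha) \le \bar{I}(f, \alpha)$.

Proof. If $\varepsilon > 0$ is given, there exists a partition $P_1$ such that

$$U(P_1, f, \alpha) < \bar{I}(f, \alpha) + \varepsilon.$$

By Theorem 7.15, it follows that $\bar{I}(f, \alpha) + \varepsilon$ is an upper bound
to all lower sums $L(P, f, \alpha)$. Hence,
$\underline{I}(f, \alpha) \le \bar{I}(f, \alpha) + \varepsilon$, and, since
$\varepsilon$ is arbitrary, this implies
$\underline{I}(f, \alpha) \le \bar{I}(f, \alpha)$.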

Example. It is easy to give an example in which
$\underline{I}(f, \alpha) < \bar{I}(f, \alpha)$. Let $\alpha(x) = x$ and define $f$ on
$[0, 1]$ as follows:

$$f(x) = 1 \text{ if } x \text{ is rational}, \qquad f(x) = 0 \text{ if } x \text{ is irrational}.$$

Then for every partition $P$ of $[0, 1]$, we have $M_k(f) = 1$ and $m_k(f) = 0$,
since every subinterval contains both rational and irrational numbers. Therefore,
$U(P, f) = 1$ and $L(P, f) = 0$ for all $P$. It follows that we have, for
$[a, b] = [0, 1]$,

$$\overline{\int_a^b} f\,dx = 1 \quad\text{and}\quad \underline{\int_a^b} f\,dx = 0.$$

Observe that the same result holds if $f(x) = 0$ when $x$ is rational, and
$f(x) = 1$ when $x$ is irrational.


7.12 ADDITIVE AND LINEARITY PROPERTIES OF UPPER AND 
LOWER INTEGRALS 

Upper and lower integrals share many of the properties of the integral. For
example, we have

$$\overline{\int_a^b} f\,d\alpha = \overline{\int_a^c} f\,d\alpha + \overline{\int_c^b} f\,d\alpha,$$

if $a < c < b$, and the same equation holds for lower integrals. However, certain
equations which hold for integrals must be replaced by inequalities when they are
stated for upper and lower integrals. For example, we have

$$\overline{\int_a^b} (f + g)\,d\alpha \le \overline{\int_a^b} f\,d\alpha + \overline{\int_a^b} g\,d\alpha,$$

and

$$\underline{\int_a^b} (f + g)\,d\alpha \ge \underline{\int_a^b} f\,d\alpha + \underline{\int_a^b} g\,d\alpha.$$

These remarks can be easily verified by the reader. (See Exercise 7.11.)


7.13 RIEMANN’S CONDITION 

If we are to expect equality of the upper and lower integrals, then we must also
expect the upper sums to become arbitrarily close to the lower sums. Hence it
seems reasonable to seek those functions $f$ for which the difference
$U(P, f, \alpha) - L(P, f, \alpha)$ can be made arbitrarily small.

Definition 7.18. We say that $f$ satisfies Riemann's condition with respect to
$\alpha$ on $[a, b]$ if, for every $\varepsilon > 0$, there exists a partition
$P_\varepsilon$ such that $P$ finer than $P_\varepsilon$ implies

$$0 \le U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon.$$
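Riemann's condition can be made concrete by searching for a partition whose gap $U - L$ drops below a given $\varepsilon$. The following sketch assumes that both $f$ and $\alpha$ are increasing on $[0, 1]$, so that $M_k(f) - m_k(f) = f(x_k) - f(x_{k-1})$; the names `gap` and `partition_size_for` are illustrative. By Theorem 7.15, any partition finer than the one found has a gap that is no larger.

```python
from fractions import Fraction

def gap(f, alpha, n):
    """U(P, f, alpha) - L(P, f, alpha) on the uniform partition of [0, 1]
    into n parts, assuming f is increasing."""
    pts = [Fraction(k, n) for k in range(n + 1)]
    return sum((f(r) - f(l)) * (alpha(r) - alpha(l)) for l, r in zip(pts, pts[1:]))

def partition_size_for(f, alpha, eps):
    """Double n until the uniform partition satisfies U - L < eps."""
    n = 1
    while gap(f, alpha, n) >= eps:
        n *= 2
    return n

# Illustrative choice: f(x) = x^2 and alpha(x) = x^3 on [0, 1].
f = lambda x: x ** 2
alpha = lambda x: x ** 3
n = partition_size_for(f, alpha, Fraction(1, 100))
print(n, float(gap(f, alpha, n)))
```

Doubling is used so that each candidate partition refines the previous one; the search terminates here because the gap is at most $[f(1) - f(0)]$ times the largest $\Delta\alpha_k$, which tends to 0 for this $\alpha$.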

Theorem 7.19. Assume that $\alpha \nearrow$ on $[a, b]$. Then the following three
statements are equivalent:

i) $f \in R(\alpha)$ on $[a, b]$.

ii) $f$ satisfies Riemann's condition with respect to $\alpha$ on $[a, b]$.

iii) $\underline{I}(f, \alpha) = \bar{I}(f, \alpha)$.

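Proof. We will prove that part (i) implies (ii), part (ii) implies (iii), and part
(iii) implies (i). Assume that (i) holds. If $\alpha(b) = \alpha(a)$, then (ii) holds
trivially, so we can assume that $\alpha(a) < \alpha(b)$. Given $\varepsilon > 0$,
choose $P_\varepsilon$ so that for any finer $P$ and all choices of $t_k$ and $t_k'$
in $[x_{k-1}, x_k]$, we have

$$\left|\sum_{k=1}^{n} f(t_k)\,\Delta\alpha_k - A\right| < \frac{\varepsilon}{3} \quad\text{and}\quad \left|\sum_{k=1}^{n} f(t_k')\,\Delta\alpha_k - A\right| < \frac{\varepsilon}{3},$$

where $A = \int_a^b f\,d\alpha$. Combining these inequalities, we find

$$\left|\sum_{k=1}^{n} [f(t_k) - f(t_k')]\,\Delta\alpha_k\right| < \frac{2}{3}\varepsilon.$$

Since $M_k(f) - m_k(f) = \sup\{f(x) - f(x') : x, x' \text{ in } [x_{k-1}, x_k]\}$, it
follows that for every $h > 0$ we can choose $t_k$ and $t_k'$ so that

$$f(t_k) - f(t_k') > M_k(f) - m_k(f) - h.$$

Making a choice corresponding to $h = \tfrac{1}{3}\varepsilon/[\alpha(b) - \alpha(a)]$,
we can write

$$U(P, f, \alpha) - L(P, f, \alpha) = \sum_{k=1}^{n} [M_k(f) - m_k(f)]\,\Delta\alpha_k < \sum_{k=1}^{n} [f(t_k) - f(t_k')]\,\Delta\alpha_k + h\sum_{k=1}^{n} \Delta\alpha_k < \varepsilon.$$

Hence, (i) implies (ii).

Next, assume that (ii) holds. If $\varepsilon > 0$ is given, there exists a partition
$P_\varepsilon$ such that $P$ finer than $P_\varepsilon$ implies
$U(P, f, \alpha) < L(P, f, \alpha) + \varepsilon$. Hence, for such $P$ we have

$$\bar{I}(f, \alpha) \le U(P, f, \alpha) < L(P, f, \alpha) + \varepsilon \le \underline{I}(f, \alpha) + \varepsilon.$$

That is, $\bar{I}(f, \alpha) \le \underline{I}(f, \alpha) + \varepsilon$ for every
$\varepsilon > 0$. Therefore, $\bar{I}(f, \alpha) \le \underline{I}(f, \alpha)$. But,
by Theorem 7.17, we also have the opposite inequality. Hence (ii) implies (iii).

Finally, assume that $\bar{I}(f, \alpha) = \underline{I}(f, \alpha)$ and let $A$
denote their common value. We will prove that $\int_a^b f\,d\alpha$ exists and equals
$A$. Given $\varepsilon > 0$, choose $P_\varepsilon'$ so that
$U(P, f, \alpha) < \bar{I}(f, \alpha) + \varepsilon$ for all $P$ finer than
$P_\varepsilon'$. Also choose $P_\varepsilon''$ such that

$$L(P, f, \alpha) > \underline{I}(f, \alpha) - \varepsilon$$

for all $P$ finer than $P_\varepsilon''$. If
$P_\varepsilon = P_\varepsilon' \cup P_\varepsilon''$, we can write

$$\underline{I}(f, \alpha) - \varepsilon < L(P, f, \alpha) \le S(P, f, \alpha) \le U(P, f, \alpha) < \bar{I}(f, \alpha) + \varepsilon$$

for every $P$ finer than $P_\varepsilon$. But, since
$\underline{I}(f, \alpha) = \bar{I}(f, \alpha) = A$, this means that
$|S(P, f, \alpha) - A| < \varepsilon$ whenever $P$ is finer than $P_\varepsilon$. This
proves that $\int_a^b f\,d\alpha$ exists and equals $A$, and the proof of the theorem
is now complete.


7.14 COMPARISON THEOREMS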


Theorem 7.20. Assume that $\alpha \nearrow$ on $[a, b]$. If $f \in R(\alpha)$ and
$g \in R(\alpha)$ on $[a, b]$ and if $f(x) \le g(x)$ for all $x$ in $[a, b]$, then we
have

$$\int_a^b f(x)\,d\alpha(x) \le \int_a^b g(x)\,d\alpha(x).$$

Proof. For every partition $P$, the corresponding Riemann-Stieltjes sums satisfy

$$S(P, f, \alpha) = \sum_{k=1}^{n} f(t_k)\,\Delta\alpha_k \le \sum_{k=1}^{n} g(t_k)\,\Delta\alpha_k = S(P, g, \alpha),$$

since $\alpha \nearrow$ on $[a, b]$. From this the theorem follows easily.

In particular, this theorem implies that $\int_a^b g(x)\,d\alpha(x) \ge 0$ whenever
$g(x) \ge 0$ and $\alpha \nearrow$ on $[a, b]$.


Theorem 7.21. Assume that $\alpha \nearrow$ on $[a, b]$. If $f \in R(\alpha)$ on
$[a, b]$, then $|f| \in R(\alpha)$ on $[a, b]$ and we have the inequality

$$\left|\int_a^b f(x)\,d\alpha(x)\right| \le \int_a^b |f(x)|\,d\alpha(x).$$

Proof. Using the notation of Definition 7.14, we can write

$$M_k(f) - m_k(f) = \sup\{f(x) - f(y) : x, y \text{ in } [x_{k-1}, x_k]\}.$$

Since the inequality $\big||f(x)| - |f(y)|\big| \le |f(x) - f(y)|$ always holds, it
follows that we have

$$M_k(|f|) - m_k(|f|) \le M_k(f) - m_k(f).$$

Multiplying by $\Delta\alpha_k$ and summing on $k$, we obtain

$$U(P, |f|, \alpha) - L(P, |f|, \alpha) \le U(P, f, \alpha) - L(P, f, \alpha),$$

for every partition $P$ of $[a, b]$. By applying Riemann's condition, we find that
$|f| \in R(\alpha)$ on $[a, b]$. The inequality in the theorem follows by taking
$g = |f|$ in Theorem 7.20.

NOTE. The converse of Theorem 7.21 is not true. (See Exercise 7.12.)

Theorem 7.22. Assume that $\alpha \nearrow$ on $[a, b]$. If $f \in R(\alpha)$ on
$[a, b]$, then $f^2 \in R(\alpha)$ on $[a, b]$.

Proof. Using the notation of Definition 7.14, we have

$$M_k(f^2) = [M_k(|f|)]^2 \quad\text{and}\quad m_k(f^2) = [m_k(|f|)]^2.$$

Hence we can write

$$M_k(f^2) - m_k(f^2) = [M_k(|f|) + m_k(|f|)][M_k(|f|) - m_k(|f|)] \le 2M[M_k(|f|) - m_k(|f|)],$$

where $M$ is an upper bound for $|f|$ on $[a, b]$. By applying Riemann's condition,
the conclusion follows.

Theorem 7.23. Assume that $\alpha \nearrow$ on $[a, b]$. If $f \in R(\alpha)$ and
$g \in R(\alpha)$ on $[a, b]$, then the product $f \cdot g \in R(\alpha)$ on $[a, b]$.

Proof. We use Theorem 7.22 along with the identity

$$2f(x)g(x) = [f(x) + g(x)]^2 - [f(x)]^2 - [g(x)]^2.$$

7.15 INTEGRATORS OF BOUNDED VARIATION 

In Theorem 6.13 we found that every function $\alpha$ of bounded variation on
$[a, b]$ can be expressed as the difference of two increasing functions. If
$\alpha = \alpha_1 - \alpha_2$ is such a decomposition and if $f \in R(\alpha_1)$ and
$f \in R(\alpha_2)$ on $[a, b]$, it follows by linearity that $f \in R(\alpha)$ on
$[a, b]$. However, the converse is not always true. If $f \in R(\alpha)$ on $[a, b]$,
it is quite possible to choose increasing functions $\alpha_1$ and $\alpha_2$ such
that $\alpha = \alpha_1 - \alpha_2$, but such that neither integral
$\int_a^b f\,d\alpha_1$, $\int_a^b f\,d\alpha_2$ exists. The difficulty, of course, is
due to the nonuniqueness of the decomposition $\alpha = \alpha_1 - \alpha_2$.
However, we can prove that there is at least one decomposition for which the
converse is true, namely, when $\alpha_1$ is the total variation of $\alpha$ and
$\alpha_2 = \alpha_1 - \alpha$. (Recall Definition 6.8.)

Theorem 7.24. Assume that $\alpha$ is of bounded variation on $[a, b]$. Let $V(x)$
denote the total variation of $\alpha$ on $[a, x]$ if $a < x \le b$, and let
$V(a) = 0$. Let $f$ be defined and bounded on $[a, b]$. If $f \in R(\alpha)$ on
$[a, b]$, then $f \in R(V)$ on $[a, b]$.

Proof. If $V(b) = 0$, then $V$ is constant and the result is trivial. Suppose,
therefore, that $V(b) > 0$. Suppose also that $|f(x)| \le M$ if $x \in [a, b]$. Since
$V$ is increasing, we need only verify that $f$ satisfies Riemann's condition with
respect to $V$ on $[a, b]$.

Given $\varepsilon > 0$, choose $P_\varepsilon$ so that for any finer $P$ and all
choices of points $t_k$ and $t_k'$ in $[x_{k-1}, x_k]$ we have

$$\left|\sum_{k=1}^{n} [f(t_k) - f(t_k')]\,\Delta\alpha_k\right| < \frac{\varepsilon}{4} \quad\text{and}\quad V(b) < \sum_{k=1}^{n} |\Delta\alpha_k| + \frac{\varepsilon}{4M}.$$

For $P$ finer than $P_\varepsilon$ we will establish the two inequalities

$$\sum_{k=1}^{n} [M_k(f) - m_k(f)](\Delta V_k - |\Delta\alpha_k|) < \frac{\varepsilon}{2},$$

and

$$\sum_{k=1}^{n} [M_k(f) - m_k(f)]\,|\Delta\alpha_k| < \frac{\varepsilon}{2},$$

which, by addition, yield $U(P, f, V) - L(P, f, V) < \varepsilon$.

To prove the first inequality, we note that $\Delta V_k - |\Delta\alpha_k| \ge 0$ and
hence

$$\sum_{k=1}^{n} [M_k(f) - m_k(f)](\Delta V_k - |\Delta\alpha_k|) \le 2M\sum_{k=1}^{n} (\Delta V_k - |\Delta\alpha_k|) = 2M\left(V(b) - \sum_{k=1}^{n} |\Delta\alpha_k|\right) < \frac{\varepsilon}{2}.$$

To prove the second inequality, let

$$A(P) = \{k : \Delta\alpha_k \ge 0\}, \qquad B(P) = \{k : \Delta\alpha_k < 0\},$$

and let $h = \tfrac{1}{4}\varepsilon/V(b)$. If $k \in A(P)$, choose $t_k$ and $t_k'$ so
that

$$f(t_k) - f(t_k') > M_k(f) - m_k(f) - h;$$

but, if $k \in B(P)$, choose $t_k$ and $t_k'$ so that
$f(t_k') - f(t_k) > M_k(f) - m_k(f) - h$. Then

$$\sum_{k=1}^{n} [M_k(f) - m_k(f)]\,|\Delta\alpha_k| < \sum_{k \in A(P)} [f(t_k) - f(t_k')]\,|\Delta\alpha_k| + \sum_{k \in B(P)} [f(t_k') - f(t_k)]\,|\Delta\alpha_k| + h\sum_{k=1}^{n} |\Delta\alpha_k|$$

$$= \sum_{k=1}^{n} [f(t_k) - f(t_k')]\,\Delta\alpha_k + h\sum_{k=1}^{n} |\Delta\alpha_k| < \frac{\varepsilon}{4} + hV(b) = \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \frac{\varepsilon}{2}.$$

It follows that $f \in R(V)$ on $[a, b]$.

note. This theorem (together with Theorem 6.12) enables us to reduce the theory 
of Riemann-Stieltjes integration for integrators of bounded variation to the case 
of increasing integrators. Riemann’s condition then becomes available and it 
turns out to be a particularly useful tool in this work. As a first application we shall 
obtain a result which is closely related to Theorem 7.4. 

Theorem 7.25. Let $\alpha$ be of bounded variation on $[a, b]$ and assume that
$f \in R(\alpha)$ on $[a, b]$. Then $f \in R(\alpha)$ on every subinterval $[c, d]$ of
$[a, b]$.

Proof. Let $V(x)$ denote the total variation of $\alpha$ on $[a, x]$, with
$V(a) = 0$. Then $\alpha = V - (V - \alpha)$, where both $V$ and $V - \alpha$ are
increasing on $[a, b]$ (Theorem 6.12). By Theorem 7.24, $f \in R(V)$, and hence
$f \in R(V - \alpha)$ on $[a, b]$. Therefore, if the theorem is true for increasing
integrators, it follows that $f \in R(V)$ on $[c, d]$ and $f \in R(V - \alpha)$ on
$[c, d]$, so $f \in R(\alpha)$ on $[c, d]$.

Hence, it suffices to prove the theorem when $\alpha \nearrow$ on $[a, b]$. By
Theorem 7.4 it suffices to prove that each integral $\int_a^c f\,d\alpha$ and
$\int_a^d f\,d\alpha$ exists. Assume that $a < c < b$. If $P$ is a partition of
$[a, x]$, let $\Delta(P, x)$ denote the difference

$$\Delta(P, x) = U(P, f, \alpha) - L(P, f, \alpha),$$

of the upper and lower sums associated with the interval $[a, x]$. Since
$f \in R(\alpha)$ on $[a, b]$, Riemann's condition holds. Hence, if $\varepsilon > 0$
is given, there exists a partition $P_\varepsilon$ of $[a, b]$ such that
$\Delta(P, b) < \varepsilon$ if $P$ is finer than $P_\varepsilon$. We can assume that
$c \in P_\varepsilon$. The points of $P_\varepsilon$ in $[a, c]$ form a partition
$P_\varepsilon'$ of $[a, c]$. If $P'$ is a partition of $[a, c]$ finer than
$P_\varepsilon'$, then $P = P' \cup P_\varepsilon$ is a partition of $[a, b]$
composed of the points of $P'$ along with those points of $P_\varepsilon$ in
$[c, b]$. Now the sum defining $\Delta(P', c)$ contains only part of the terms in
the sum defining $\Delta(P, b)$. Since each term is $\ge 0$ and since $P$ is finer
than $P_\varepsilon$, we have

$$\Delta(P', c) \le \Delta(P, b) < \varepsilon.$$

That is, $P'$ finer than $P_\varepsilon'$ implies $\Delta(P', c) < \varepsilon$.
Hence, $f$ satisfies Riemann's condition on $[a, c]$ and $\int_a^c f\,d\alpha$
exists. The same argument, of course, shows that $\int_a^d f\,d\alpha$ exists, and by
Theorem 7.4 it follows that $\int_c^d f\,d\alpha$ exists.

The next theorem is an application of Theorems 7.23, 7.21, and 7.25. 


Theorem 7.26. Assume $f \in R(\alpha)$ and $g \in R(\alpha)$ on $[a, b]$, where
$\alpha \nearrow$ on $[a, b]$. Define

$$F(x) = \int_a^x f(t)\,d\alpha(t)$$

and

$$G(x) = \int_a^x g(t)\,d\alpha(t) \quad\text{if } x \in [a, b].$$

Then $f \in R(G)$, $g \in R(F)$, and the product $f \cdot g \in R(\alpha)$ on
$[a, b]$, and we have

$$\int_a^b f(x)g(x)\,d\alpha(x) = \int_a^b f(x)\,dG(x) = \int_a^b g(x)\,dF(x).$$


Proof. The integral $\int_a^b f \cdot g\,d\alpha$ exists by Theorem 7.23. For every
partition $P$ of $[a, b]$ we have

$$S(P, f, G) = \sum_{k=1}^{n} f(t_k)\int_{x_{k-1}}^{x_k} g(t)\,d\alpha(t) = \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} f(t_k)g(t)\,d\alpha(t),$$

and

$$\int_a^b f(x)g(x)\,d\alpha(x) = \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} f(t)g(t)\,d\alpha(t).$$

Therefore, if $M_g = \sup\{|g(x)| : x \in [a, b]\}$, we have

$$\left|S(P, f, G) - \int_a^b f \cdot g\,d\alpha\right| = \left|\sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} \{f(t_k) - f(t)\}g(t)\,d\alpha(t)\right|$$

$$\le M_g \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} |f(t_k) - f(t)|\,d\alpha(t) \le M_g \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} [M_k(f) - m_k(f)]\,d\alpha(t)$$

$$= M_g\{U(P, f, \alpha) - L(P, f, \alpha)\}.$$

Since $f \in R(\alpha)$, for every $\varepsilon > 0$ there is a partition
$P_\varepsilon$ such that $P$ finer than $P_\varepsilon$ implies
$U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon$. This proves that $f \in R(G)$ on
$[a, b]$ and that $\int_a^b f \cdot g\,d\alpha = \int_a^b f\,dG$. A similar argument
shows that $g \in R(F)$ on $[a, b]$ and that
$\int_a^b f \cdot g\,d\alpha = \int_a^b g\,dF$.

NOTE. Theorem 7.26 is also valid if $\alpha$ is of bounded variation on $[a, b]$.


7.16 SUFFICIENT CONDITIONS FOR EXISTENCE OF RIEMANN-STIELTJES 
INTEGRALS 

In most of the previous theorems we have assumed that certain integrals existed 
and then studied their properties. It is quite natural to ask : When does the integral 
exist? Two useful sufficient conditions will be obtained. 

Theorem 7.27. If $f$ is continuous on $[a, b]$ and if $\alpha$ is of bounded
variation on $[a, b]$, then $f \in R(\alpha)$ on $[a, b]$.

NOTE. By Theorem 7.6, a second sufficient condition can be obtained by
interchanging $f$ and $\alpha$ in the hypothesis.

Proof. It suffices to prove the theorem when $\alpha \nearrow$ with
$\alpha(a) < \alpha(b)$. Continuity of $f$ on $[a, b]$ implies uniform continuity, so
that if $\varepsilon > 0$ is given, we can find $\delta > 0$ (depending only on
$\varepsilon$) such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon/A$,
where $A = 2[\alpha(b) - \alpha(a)]$. If $P_\varepsilon$ is a partition with norm
$\|P_\varepsilon\| < \delta$, then for $P$ finer than $P_\varepsilon$ we must have

$$M_k(f) - m_k(f) \le \varepsilon/A,$$

since $M_k(f) - m_k(f) = \sup\{f(x) - f(y) : x, y \text{ in } [x_{k-1}, x_k]\}$.
Multiplying the inequality by $\Delta\alpha_k$ and summing, we find

$$U(P, f, \alpha) - L(P, f, \alpha) \le \frac{\varepsilon}{A}\sum_{k=1}^{n} \Delta\alpha_k = \frac{\varepsilon}{2} < \varepsilon,$$

and we see that Riemann's condition holds. Hence, $f \in R(\alpha)$ on $[a, b]$.

For the special case in which $\alpha(x) = x$, Theorems 7.27 and 7.6 give the
following corollary:

Theorem 7.28. Each of the following conditions is sufficient for the existence of
the Riemann integral $\int_a^b f(x)\,dx$:

a) $f$ is continuous on $[a, b]$.
b) $f$ is of bounded variation on $[a, b]$.


7.17 NECESSARY CONDITIONS FOR EXISTENCE OF RIEMANN-STIELTJES
INTEGRALS 

When $\alpha$ is of bounded variation on $[a, b]$, continuity of $f$ is sufficient
for the existence of $\int_a^b f\,d\alpha$. Continuity of $f$ throughout $[a, b]$ is
by no means necessary, however. For example, in Theorem 7.9 we found that when
$\alpha$ is a step function, then $f$ can be defined quite arbitrarily in $[a, b]$
provided only that $f$ is continuous at the discontinuities of $\alpha$. The next
theorem tells us that common discontinuities from the right or from the left must
be avoided if the integral is to exist.

Theorem 7.29. Assume that $\alpha \nearrow$ on $[a, b]$ and let $a < c < b$. Assume
further that both $\alpha$ and $f$ are discontinuous from the right at $x = c$; that
is, assume that there exists an $\varepsilon > 0$ such that for every $\delta > 0$
there are values of $x$ and $y$ in the interval $(c, c + \delta)$ for which

$$|f(x) - f(c)| \ge \varepsilon \quad\text{and}\quad |\alpha(y) - \alpha(c)| \ge \varepsilon.$$

Then the integral $\int_a^b f(x)\,d\alpha(x)$ cannot exist. The integral also fails to
exist if $\alpha$ and $f$ are discontinuous from the left at $c$.

Proof. Let $P$ be a partition of $[a, b]$ containing $c$ as a point of subdivision
and form the difference

$$U(P, f, \alpha) - L(P, f, \alpha) = \sum_{k=1}^{n} [M_k(f) - m_k(f)]\,\Delta\alpha_k.$$

If the $i$th subinterval has $c$ as its left endpoint, then

$$U(P, f, \alpha) - L(P, f, \alpha) \ge [M_i(f) - m_i(f)][\alpha(x_i) - \alpha(c)],$$

since each term of the sum is $\ge 0$. If $c$ is a common discontinuity from the
right, we can assume that the point $x_i$ is chosen so that
$\alpha(x_i) - \alpha(c) \ge \varepsilon$. Furthermore, the hypothesis of the theorem
implies $M_i(f) - m_i(f) \ge \varepsilon$. Hence,

$$U(P, f, \alpha) - L(P, f, \alpha) \ge \varepsilon^2,$$

and Riemann's condition cannot be satisfied. (If $c$ is a common discontinuity
from the left, the argument is similar.)

7.18 MEAN-VALUE THEOREMS FOR RIEMANN-STIELTJES INTEGRALS 

Although integrals occur in a wide variety of problems, there are relatively few 
cases in which the explicit value of the integral can be obtained. However, it 
often suffices to have an estimate for the integral rather than its exact value. The 
Mean Value Theorems of this section are especially useful in making such estimates. 

Theorem 7.30 (First Mean-Value Theorem for Riemann-Stieltjes integrals). Assume
that $\alpha \nearrow$ and let $f \in R(\alpha)$ on $[a, b]$. Let $M$ and $m$ denote,
respectively, the sup and inf of the set $\{f(x) : x \in [a, b]\}$. Then there exists
a real number $c$ satisfying $m \le c \le M$ such that

$$\int_a^b f(x)\,d\alpha(x) = c\int_a^b d\alpha(x) = c[\alpha(b) - \alpha(a)].$$

In particular, if $f$ is continuous on $[a, b]$, then $c = f(x_0)$ for some $x_0$ in
$[a, b]$.

Proof. If $\alpha(a) = \alpha(b)$, the theorem holds trivially, both sides being 0.
Hence we can assume that $\alpha(a) < \alpha(b)$. Since all upper and lower sums
satisfy

$$m[\alpha(b) - \alpha(a)] \le L(P, f, \alpha) \le U(P, f, \alpha) \le M[\alpha(b) - \alpha(a)],$$

the integral $\int_a^b f\,d\alpha$ must lie between the same bounds. Therefore, the
quotient $c = \left(\int_a^b f\,d\alpha\right)\big/\left(\int_a^b d\alpha\right)$ lies
between $m$ and $M$. When $f$ is continuous on $[a, b]$, the intermediate value
theorem yields $c = f(x_0)$ for some $x_0$ in $[a, b]$.
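The number $c$ of Theorem 7.30 can be estimated numerically as the quotient used in the proof. The sketch below makes several assumptions that are not in the text: it approximates the integral by a Riemann-Stieltjes sum on a uniform partition with midpoint tags, and it uses the illustrative choices $f(x) = x$ and $\alpha(x) = x^2$ on $[0, 1]$, for which the integral equals $\int_0^1 2x^2\,dx = 2/3$.

```python
def rs_sum(f, alpha, a, b, n):
    """Riemann-Stieltjes sum S(P, f, alpha) on a uniform partition of [a, b]
    into n parts, with t_k taken at the midpoint of each subinterval."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        left, right = a + k * h, a + (k + 1) * h
        total += f((left + right) / 2) * (alpha(right) - alpha(left))
    return total

def mean_value(f, alpha, a, b, n=1000):
    """Approximate c = (integral of f d(alpha)) / (alpha(b) - alpha(a)).
    Requires alpha(a) < alpha(b)."""
    return rs_sum(f, alpha, a, b, n) / (alpha(b) - alpha(a))

c = mean_value(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(c)  # close to 2/3
```

Since $f(x) = x$ here, the point $x_0$ with $f(x_0) = c$ is $c$ itself, and it lies between $m = 0$ and $M = 1$ as the theorem requires.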

A second theorem of this type can be obtained from the first by using
integration by parts.

Theorem 7.31 (Second Mean-Value Theorem for Riemann-Stieltjes integrals).
Assume that $\alpha$ is continuous and that $f \nearrow$ on $[a, b]$. Then there
exists a point $x_0$ in $[a, b]$ such that

$$\int_a^b f(x)\,d\alpha(x) = f(a)\int_a^{x_0} d\alpha(x) + f(b)\int_{x_0}^{b} d\alpha(x).$$

Proof. By Theorem 7.6, we have

$$\int_a^b f(x)\,d\alpha(x) = f(b)\alpha(b) - f(a)\alpha(a) - \int_a^b \alpha(x)\,df(x).$$

Applying Theorem 7.30 to the integral on the right, we find

$$\int_a^b f(x)\,d\alpha(x) = f(a)[\alpha(x_0) - \alpha(a)] + f(b)[\alpha(b) - \alpha(x_0)],$$

where $x_0 \in [a, b]$, which is the statement we set out to prove.


7.19 THE INTEGRAL AS A FUNCTION OF THE INTERVAL 

If $f \in R(\alpha)$ on $[a, b]$ and if $\alpha$ is of bounded variation, then (by
Theorem 7.25) the integral $\int_a^x f\,d\alpha$ exists for each $x$ in $[a, b]$ and
can be studied as a function of $x$. Some properties of this function will now be
obtained.

Theorem 7.32. Let $\alpha$ be of bounded variation on $[a, b]$ and assume that
$f \in R(\alpha)$ on $[a, b]$. Define $F$ by the equation

$$F(x) = \int_a^x f\,d\alpha, \quad\text{if } x \in [a, b].$$

Then we have:

i) $F$ is of bounded variation on $[a, b]$.

ii) Every point of continuity of $\alpha$ is also a point of continuity of $F$.

iii) If $\alpha \nearrow$ on $[a, b]$, the derivative $F'(x)$ exists at each point
$x$ in $(a, b)$ where $\alpha'(x)$ exists and where $f$ is continuous. For such $x$,
we have

$$F'(x) = f(x)\alpha'(x).$$

Proof. It suffices to assume that $\alpha \nearrow$ on $[a, b]$. If $x \ne y$,
Theorem 7.30 implies that

$$F(y) - F(x) = \int_x^y f\,d\alpha = c[\alpha(y) - \alpha(x)],$$

where $m \le c \le M$ (in the notation of Theorem 7.30). Statements (i) and (ii)
follow at once from this equation. To prove (iii), we divide by $y - x$ and observe
that $c \to f(x)$ as $y \to x$.


When Theorem 7.32 is used in conjunction with Theorem 7.26, we obtain the
following theorem which converts a Riemann integral of a product $f \cdot g$ into a
Riemann-Stieltjes integral $\int_a^b f\,dG$ with a continuous integrator of bounded
variation.

Theorem 7.33. If $f \in R$ and $g \in R$ on $[a, b]$, let

$$F(x) = \int_a^x f(t)\,dt, \qquad G(x) = \int_a^x g(t)\,dt \quad\text{if } x \in [a, b].$$

Then $F$ and $G$ are continuous functions of bounded variation on $[a, b]$. Also,
$f \in R(G)$ and $g \in R(F)$ on $[a, b]$, and we have

$$\int_a^b f(x)g(x)\,dx = \int_a^b f(x)\,dG(x) = \int_a^b g(x)\,dF(x).$$

Proof. Parts (i) and (ii) of Theorem 7.32 show that $F$ and $G$ are continuous
functions of bounded variation on $[a, b]$. The existence of the integrals and the
two formulas for $\int_a^b f(x)g(x)\,dx$ follow by taking $\alpha(x) = x$ in Theorem
7.26.

NOTE. When $\alpha(x) = x$, part (iii) of Theorem 7.32 is sometimes called the first
fundamental theorem of integral calculus. It states that $F'(x) = f(x)$ at each
point of continuity of $f$. A companion result, called the second fundamental
theorem, is given in the next section.


7.20 SECOND FUNDAMENTAL THEOREM OF INTEGRAL CALCULUS 
The next theorem tells how to integrate a derivative. 

Theorem 7.34 (Second fundamental theorem of integral calculus). Assume that
$f \in R$ on $[a, b]$. Let $g$ be a function defined on $[a, b]$ such that the
derivative $g'$ exists in $(a, b)$ and has the value

$$g'(x) = f(x) \quad\text{for every } x \text{ in } (a, b).$$

At the endpoints assume that $g(a+)$ and $g(b-)$ exist and satisfy

$$g(a) - g(a+) = g(b) - g(b-).$$

Then we have

$$\int_a^b f(x)\,dx = \int_a^b g'(x)\,dx = g(b) - g(a).$$

Proof. For every partition of $[a, b]$ we can write

$$g(b) - g(a) = \sum_{k=1}^{n} [g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} g'(t_k)\,\Delta x_k = \sum_{k=1}^{n} f(t_k)\,\Delta x_k,$$

where $t_k$ is a point in $(x_{k-1}, x_k)$ determined by the Mean-Value Theorem of
differential calculus. But, for a given $\varepsilon > 0$, the partition can be taken
so fine that

$$\left|g(b) - g(a) - \int_a^b f(x)\,dx\right| = \left|\sum_{k=1}^{n} f(t_k)\,\Delta x_k - \int_a^b f(x)\,dx\right| < \varepsilon,$$

and this proves the theorem.

The second fundamental theorem can be combined with Theorem 7.33 to give 
the following strengthening of Theorem 7.8. 

Theorem 7.35. Assume $f \in R$ on $[a, b]$. Let $\alpha$ be a function which is
continuous on $[a, b]$ and whose derivative $\alpha'$ is Riemann integrable on
$[a, b]$. Then the following integrals exist and are equal:

$$\int_a^b f(x)\,d\alpha(x) = \int_a^b f(x)\alpha'(x)\,dx.$$

Proof. By the second fundamental theorem we have, for each $x$ in $[a, b]$,

$$\alpha(x) - \alpha(a) = \int_a^x \alpha'(t)\,dt.$$

Taking $g = \alpha'$ in Theorem 7.33 we obtain Theorem 7.35.

NOTE. A related result is described in Exercise 7.34.
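The equality in Theorem 7.35 can be observed numerically by comparing a Riemann-Stieltjes sum for $\int f\,d\alpha$ with an ordinary Riemann sum for $\int f\alpha'\,dx$. This is a sketch only: the uniform partition with midpoint tags, the helper names, and the example $f(x) = x$, $\alpha(x) = \sin x$ on $[0, \pi/2]$ are all assumptions made for illustration. For this example integration by parts gives the common value $\pi/2 - 1$.

```python
import math

def rs_sum(f, alpha, a, b, n):
    """Riemann-Stieltjes sum with midpoint tags on a uniform partition."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        left, right = a + k * h, a + (k + 1) * h
        total += f((left + right) / 2) * (alpha(right) - alpha(left))
    return total

def riemann_sum(g, a, b, n):
    """Ordinary Riemann sum (midpoint rule) for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

a, b, n = 0.0, math.pi / 2, 2000
stieltjes = rs_sum(lambda x: x, math.sin, a, b, n)           # integral of x d(sin x)
ordinary = riemann_sum(lambda x: x * math.cos(x), a, b, n)   # integral of x cos x dx
print(stieltjes, ordinary, math.pi / 2 - 1)
```

The two sums differ only by the error of replacing $\sin x_k - \sin x_{k-1}$ with $\cos(t_k)\,\Delta x_k$, which shrinks as the partition is refined.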


7.21 CHANGE OF VARIABLE IN A RIEMANN INTEGRAL 

The formula $\int_a^b f\,d\alpha = \int_c^d h\,d\beta$ of Theorem 7.7 for changing the
variable in an integral assumes the form

$$\int_{g(c)}^{g(d)} f(x)\,dx = \int_c^d f[g(t)]g'(t)\,dt,$$

when $\alpha(x) = x$ and when $g$ is a strictly monotonic function with a continuous
derivative $g'$. It is valid if $f \in R$ on $[a, b]$. When $f$ is continuous, we can
use Theorem 7.32 to remove the restriction that $g$ be monotonic. In fact, we have
the following theorem:

Theorem 7.36 (Change of variable in a Riemann integral). Assume that $g$ has a
continuous derivative $g'$ on an interval $[c, d]$. Let $f$ be continuous on
$g([c, d])$ and define $F$ by the equation

$$F(x) = \int_{g(c)}^{x} f(t)\,dt \quad\text{if } x \in g([c, d]).$$

Then, for each $x$ in $[c, d]$ the integral $\int_c^x f[g(t)]g'(t)\,dt$ exists and
has the value $F[g(x)]$. In particular, we have

$$\int_{g(c)}^{g(d)} f(x)\,dx = \int_c^d f[g(t)]g'(t)\,dt.$$

Proof. Since both $g'$ and the composite function $f \circ g$ are continuous on
$[c, d]$ the integral in question exists. Define $G$ on $[c, d]$ as follows:

$$G(x) = \int_c^x f[g(t)]g'(t)\,dt.$$

We are to show that $G(x) = F[g(x)]$. By Theorem 7.32, we have

$$G'(x) = f[g(x)]g'(x),$$

and, by the chain rule, the derivative of $F[g(x)]$ is also $f[g(x)]g'(x)$, since
$F'(x) = f(x)$. Hence, $G(x) - F[g(x)]$ is constant. But, when $x = c$, we get
$G(c) = 0$ and $F[g(c)] = 0$, so this constant must be 0. Hence, $G(x) = F[g(x)]$
for all $x$ in $[c, d]$. In particular, when $x = d$, we get $G(d) = F[g(d)]$ and
this is the last equation in the theorem.

NOTE. Some texts prove the preceding theorem under the added hypothesis that
$g'$ is never zero on $[c, d]$, which, of course, implies monotonicity of $g$. The
above proof shows that this is not needed. It should be noted that $g$ is
continuous on $[c, d]$, so $g([c, d])$ is an interval which contains the interval
joining $g(c)$ and $g(d)$. In particular, the result is valid if $g(c) = g(d)$. This
makes the theorem especially useful in the applications. (See Fig. 7.2 for a
permissible $g$.)

Figure 7.2
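That monotonicity of $g$ is not needed in Theorem 7.36 can be illustrated with a substitution that doubles back on itself. The sketch below assumes the illustrative choices $f(x) = x$ and $g(t) = t^2$ (so $g'(t) = 2t$) and evaluates both sides with a composite Simpson rule, which is exact up to rounding for these polynomial integrands; the helper names are not from the text.

```python
def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals on [lo, hi]."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

f = lambda x: x
g = lambda t: t * t
g_prime = lambda t: 2 * t

def both_sides(c, d):
    """Return (integral of f from g(c) to g(d), integral of f(g(t)) g'(t) from c to d)."""
    lhs = simpson(f, g(c), g(d))
    rhs = simpson(lambda t: f(g(t)) * g_prime(t), c, d)
    return lhs, rhs

print(both_sides(-1.0, 2.0))  # g is not monotonic on [-1, 2]
print(both_sides(-1.0, 1.0))  # here g(c) = g(d), so both sides vanish
```

On $[-1, 2]$ the image $g([c, d]) = [0, 4]$ is strictly larger than the interval joining $g(c) = 1$ and $g(d) = 4$, which is why the theorem asks for continuity of $f$ on all of $g([c, d])$.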

Actually, there is a more general version of Theorem 7.36 which does not
require continuity of $f$ or of $g'$, but the proof is considerably more difficult.
Assume that $h \in R$ on $[c, d]$ and, if $x \in [c, d]$, let
$g(x) = \int_a^x h(t)\,dt$, where $a$ is a fixed point in $[c, d]$. Then if $f \in R$
on $g([c, d])$ the integral $\int_c^d f[g(t)]h(t)\,dt$ exists and we have

$$\int_{g(c)}^{g(d)} f(x)\,dx = \int_c^d f[g(t)]h(t)\,dt.$$

This appears to be the most general theorem on change of variable in a Riemann
integral. (For a proof, see the article by H. Kestelman, Mathematical Gazette,
45 (1961), pp. 17-23.) Theorem 7.36 is the special case in which $h$ is continuous
on $[c, d]$ and $f$ is continuous on $g([c, d])$.


7.22 SECOND MEAN-VALUE THEOREM FOR RIEMANN INTEGRALS 

Theorem 7.37. Let $g$ be continuous and assume that $f \nearrow$ on $[a, b]$. Let
$A$ and $B$ be two real numbers satisfying the inequalities

$$A \le f(a+) \quad\text{and}\quad B \ge f(b-).$$

Then there exists a point $x_0$ in $[a, b]$ such that

i) $$\int_a^b f(x)g(x)\,dx = A\int_a^{x_0} g(x)\,dx + B\int_{x_0}^{b} g(x)\,dx.$$

In particular, if $f(x) \ge 0$ for all $x$ in $[a, b]$, we have

ii) $$\int_a^b f(x)g(x)\,dx = B\int_{x_0}^{b} g(x)\,dx, \quad\text{where } x_0 \in [a, b].$$

NOTE. Part (ii) is known as Bonnet's theorem.

Proof. If $\alpha(x) = \int_a^x g(t)\,dt$, then $\alpha' = g$. Theorem 7.31 is
applicable, and we get

$$\int_a^b f(x)g(x)\,dx = f(a)\int_a^{x_0} g(x)\,dx + f(b)\int_{x_0}^{b} g(x)\,dx.$$

This proves (i) whenever $A = f(a)$ and $B = f(b)$. Now if $A$ and $B$ are any two
real numbers satisfying $A \le f(a+)$ and $B \ge f(b-)$, we can redefine $f$ at the
endpoints $a$ and $b$ to have the values $f(a) = A$ and $f(b) = B$. The modified $f$
is still increasing on $[a, b]$ and, as we have remarked before, changing the value
of $f$ at a finite number of points does not affect the value of a Riemann
integral. (Of course, the point $x_0$ in (i) will depend on the choice of $A$ and
$B$.) By taking $A = 0$, part (ii) follows from part (i).


7.23 RIEMANN-STIELTJES INTEGRALS DEPENDING ON A PARAMETER

Theorem 7.38. Let $f$ be continuous at each point $(x, y)$ of a rectangle

$$Q = \{(x, y) : a \le x \le b,\ c \le y \le d\}.$$

Assume that $\alpha$ is of bounded variation on $[a, b]$ and let $F$ be the function
defined on $[c, d]$ by the equation

$$F(y) = \int_a^b f(x, y)\,d\alpha(x).$$

Then $F$ is continuous on $[c, d]$. In other words, if $y_0 \in [c, d]$, we have

$$\lim_{y \to y_0} \int_a^b f(x, y)\,d\alpha(x) = \int_a^b \lim_{y \to y_0} f(x, y)\,d\alpha(x) = \int_a^b f(x, y_0)\,d\alpha(x).$$

Proof. Assume that $\alpha \nearrow$ on $[a, b]$. Since $Q$ is a compact set, $f$ is
uniformly continuous on $Q$. Hence, given $\varepsilon > 0$, there exists a
$\delta > 0$ (depending only on $\varepsilon$) such that for every pair of points
$z = (x, y)$ and $z' = (x', y')$ in $Q$ with $|z - z'| < \delta$, we have
$|f(x, y) - f(x', y')| < \varepsilon$. If $|y - y'| < \delta$, we have

$$|F(y) - F(y')| \le \int_a^b |f(x, y) - f(x, y')|\,d\alpha(x) \le \varepsilon[\alpha(b) - \alpha(a)].$$

This establishes the continuity of $F$ on $[c, d]$.

Of course, when a(x) = x, this becomes a continuity theorem for Riemann 
integrals involving a parameter. However, we can derive a much more useful 
result for Riemann integrals than that obtained by simply setting a(x) = x if we 
employ Theorem 7.26. 


Theorem 7.39. If $f$ is continuous on the rectangle $[a, b] \times [c, d]$, and if
$g \in R$ on $[a, b]$, then the function $F$ defined by the equation

$$F(y) = \int_a^b g(x)f(x, y)\,dx$$

is continuous on $[c, d]$. That is, if $y_0 \in [c, d]$, we have

$$\lim_{y \to y_0} \int_a^b g(x)f(x, y)\,dx = \int_a^b g(x)f(x, y_0)\,dx.$$

Proof. If $G(x) = \int_a^x g(t)\,dt$, Theorem 7.26 shows that
$F(y) = \int_a^b f(x, y)\,dG(x)$. Now apply Theorem 7.38.


7.24 DIFFERENTIATION UNDER THE INTEGRAL SIGN


Theorem 7.40. Let $Q = \{(x, y) : a \le x \le b,\ c \le y \le d\}$. Assume that
$\alpha$ is of bounded variation on $[a, b]$ and, for each fixed $y$ in $[c, d]$,
assume that the integral

$$F(y) = \int_a^b f(x, y)\,d\alpha(x)$$

exists. If the partial derivative $D_2 f$ is continuous on $Q$, the derivative
$F'(y)$ exists for each $y$ in $(c, d)$ and is given by

$$F'(y) = \int_a^b D_2 f(x, y)\,d\alpha(x).$$

NOTE. In particular, when $g \in R$ on $[a, b]$ and $\alpha(x) = \int_a^x g(t)\,dt$,
we get

$$F(y) = \int_a^b g(x)f(x, y)\,dx \quad\text{and}\quad F'(y) = \int_a^b g(x)\,D_2 f(x, y)\,dx.$$

Proof. If $y_0 \in (c, d)$ and $y \ne y_0$, we have

$$\frac{F(y) - F(y_0)}{y - y_0} = \int_a^b \frac{f(x, y) - f(x, y_0)}{y - y_0}\,d\alpha(x) = \int_a^b D_2 f(x, \bar{y})\,d\alpha(x),$$

where $\bar{y}$ is between $y$ and $y_0$. Since $D_2 f$ is continuous on $Q$, we
obtain the conclusion by arguing as in the proof of Theorem 7.38.
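Differentiation under the integral sign can be checked numerically in the Riemann case $\alpha(x) = x$. The sketch below assumes the illustrative integrand $f(x, y) = \sin(xy)$ on $[0, 1]$, for which $D_2 f(x, y) = x\cos(xy)$, and compares the integral of $D_2 f$ with a central difference quotient of $F$; the helper names and the step size are assumptions, not part of the text.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals on [lo, hi]."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def F(y):
    """F(y) = integral over [0, 1] of sin(x y) dx."""
    return simpson(lambda x: math.sin(x * y), 0.0, 1.0)

def F_prime_by_formula(y):
    """Integral over [0, 1] of D_2 f(x, y) = x cos(x y), as in Theorem 7.40."""
    return simpson(lambda x: x * math.cos(x * y), 0.0, 1.0)

def F_prime_by_difference(y, step=1e-4):
    """Central difference quotient of F at y."""
    return (F(y + step) - F(y - step)) / (2 * step)

y0 = 0.7
print(F_prime_by_formula(y0), F_prime_by_difference(y0))
```

For this integrand $F(y) = (1 - \cos y)/y$ when $y \ne 0$, so the result can also be compared with the derivative of that closed form.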


7.25 INTERCHANGING THE ORDER OF INTEGRATION 


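Theorem 7.41. Let $Q = \{(x, y) : a \le x \le b,\ c \le y \le d\}$. Assume that
$\alpha$ is of bounded variation on $[a, b]$, $\beta$ is of bounded variation on
$[c, d]$, and $f$ is continuous on $Q$. If $(x, y) \in Q$, define

$$F(y) = \int_a^b f(x, y)\,d\alpha(x), \qquad G(x) = \int_c^d f(x, y)\,d\beta(y).$$

Then $F \in R(\beta)$ on $[c, d]$, $G \in R(\alpha)$ on $[a, b]$, and we have

$$\int_c^d F(y)\,d\beta(y) = \int_a^b G(x)\,d\alpha(x).$$

In other words, we may interchange the order of integration as follows:

$$\int_a^b \left[\int_c^d f(x, y)\,d\beta(y)\right] d\alpha(x) = \int_c^d \left[\int_a^b f(x, y)\,d\alpha(x)\right] d\beta(y).$$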


Proof. By Theorem 7.38, $F$ is continuous on $[c, d]$ and hence $F \in R(\beta)$ on
$[c, d]$. Similarly, $G \in R(\alpha)$ on $[a, b]$. To prove the equality of the two
integrals, it suffices to consider the case in which $\alpha \nearrow$ on $[a, b]$
and $\beta \nearrow$ on $[c, d]$.

By uniform continuity, given $\varepsilon > 0$ there is a $\delta > 0$ such that for
every pair of points $z = (x, y)$ and $z' = (x', y')$ in $Q$, with
$|z - z'| < \delta$, we have

$$|f(x, y) - f(x', y')| < \varepsilon.$$

Let us now subdivide $Q$ into $n^2$ equal rectangles by subdividing $[a, b]$ and
$[c, d]$ each into $n$ equal parts, where $n$ is chosen so that

$$\frac{b - a}{n} < \frac{\delta}{\sqrt{2}} \quad\text{and}\quad \frac{d - c}{n} < \frac{\delta}{\sqrt{2}}.$$

Writing

$$x_k = a + \frac{k(b - a)}{n} \quad\text{and}\quad y_k = c + \frac{k(d - c)}{n}$$

for $k = 0, 1, 2, \ldots, n$, we have

$$\int_a^b \left(\int_c^d f(x, y)\,d\beta(y)\right) d\alpha(x) = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} \int_{x_k}^{x_{k+1}} \left(\int_{y_j}^{y_{j+1}} f(x, y)\,d\beta(y)\right) d\alpha(x).$$

We apply Theorem 7.30 twice on the right. The double sum becomes

$$\sum_{k=0}^{n-1} \sum_{j=0}^{n-1} f(x_k', y_j')[\beta(y_{j+1}) - \beta(y_j)][\alpha(x_{k+1}) - \alpha(x_k)],$$

where $(x_k', y_j')$ is in the rectangle $Q_{k,j}$ having $(x_k, y_j)$ and
$(x_{k+1}, y_{j+1})$ as opposite vertices. Similarly, we find

$$\int_c^d \left(\int_a^b f(x, y)\,d\alpha(x)\right) d\beta(y) = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} f(x_k'', y_j'')[\beta(y_{j+1}) - \beta(y_j)][\alpha(x_{k+1}) - \alpha(x_k)],$$

where $(x_k'', y_j'') \in Q_{k,j}$. But
$|f(x_k', y_j') - f(x_k'', y_j'')| < \varepsilon$ and hence

$$\left|\int_a^b G(x)\,d\alpha(x) - \int_c^d F(y)\,d\beta(y)\right| \le \varepsilon \sum_{j=0}^{n-1} [\beta(y_{j+1}) - \beta(y_j)] \sum_{k=0}^{n-1} [\alpha(x_{k+1}) - \alpha(x_k)] = \varepsilon[\beta(d) - \beta(c)][\alpha(b) - \alpha(a)].$$

Since $\varepsilon$ is arbitrary, this implies equality of the two integrals.

Theorem 7.41 together with Theorem 7.26 gives the following result for Riemann
integrals.

Theorem 7.42. Let $f$ be continuous on the rectangle $[a, b] \times [c, d]$. If $g \in R$ on
$[a, b]$ and if $h \in R$ on $[c, d]$, then we have

$$\int_a^b \left[\int_c^d g(x) h(y) f(x, y)\,dy\right] dx = \int_c^d \left[\int_a^b g(x) h(y) f(x, y)\,dx\right] dy.$$

Proof. Let $\alpha(x) = \int_a^x g(u)\,du$ and let $\beta(y) = \int_c^y h(v)\,dv$, and apply Theorems 7.26
and 7.41.
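
As a numerical illustration of Theorem 7.42, the sketch below evaluates both iterated integrals with the midpoint rule. The choices $f(x, y) = e^{xy}$, $g(x) = x$, $h(y) = \cos y$, the rectangle $[0, 1] \times [0, 2]$, and the grid sizes are assumptions made only for this example. The two orders deliberately use different grids, so the comparison is between two independent approximations rather than a rearrangement of one finite sum.

```python
import math

def midpoint(func, a, b, n):
    """Midpoint-rule approximation of the Riemann integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

# Illustrative data (assumptions, not taken from the text).
def f(x, y):
    return math.exp(x * y)

def g(x):
    return x

def h(y):
    return math.cos(y)

a, b, c, d = 0.0, 1.0, 0.0, 2.0

# Inner integral over y, outer integral over x.
dy_then_dx = midpoint(
    lambda x: midpoint(lambda y: g(x) * h(y) * f(x, y), c, d, 300), a, b, 200)

# Inner integral over x, outer integral over y, on different grids.
dx_then_dy = midpoint(
    lambda y: midpoint(lambda x: g(x) * h(y) * f(x, y), a, b, 250), c, d, 350)

print(dy_then_dx, dx_then_dy)
```

The two values should differ only by the quadrature error of the grids used.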


7.26 LEBESGUE’S CRITERION FOR EXISTENCE OF RIEMANN INTEGRALS 

Every continuous function is Riemann integrable. However, continuity is certainly
not necessary, for we have seen that $f \in R$ when $f$ is of bounded variation on $[a, b]$.
In particular, $f$ can be a monotonic function with a countable set of discontinuities
and yet the integral $\int_a^b f(x)\,dx$ will exist. Actually, there are Riemann-integrable
functions whose discontinuities form an uncountable set. (See Exercise 7.32.)
Therefore, it is natural to ask "how many" discontinuities a function can have and
still be Riemann integrable. The definitive theorem on this question was discovered
by Lebesgue and is proved in this section. The idea behind Lebesgue's
theorem is revealed by examining Riemann's condition to see the kind of restriction
it puts on the set of discontinuities of $f$.

The difference between the upper and lower Riemann sums is given by

$$\sum_{k=1}^{n} [M_k(f) - m_k(f)]\,\Delta x_k,$$

and, roughly speaking, $f$ will be integrable if, and only if, this sum can be made
arbitrarily small. Split this sum into two parts, say $S_1 + S_2$, where $S_1$ comes from
subintervals containing only points of continuity of $f$, and $S_2$ contains the remaining
terms. In $S_1$, each difference $M_k(f) - m_k(f)$ is small because of continuity
and hence a large number of such terms can occur and still keep $S_1$ small. In $S_2$,
however, the differences $M_k(f) - m_k(f)$ need not be small; but because they are
bounded (say by $M$), we have $|S_2| \le M \sum \Delta x_k$, so that $S_2$ will be small if the sum
of the lengths of the subintervals corresponding to $S_2$ is small. Hence we may
expect that the set of discontinuities of an integrable function can be covered by
intervals whose total length is small.

This is the central idea in Lebesgue’s theorem. To formulate it more precisely 
we introduce sets of measure zero. 

Definition 7.43. A set $S$ of real numbers is said to have measure zero if, for every
$\varepsilon > 0$, there is a countable covering of $S$ by open intervals, the sum of whose lengths
is less than $\varepsilon$.

If the intervals are denoted by $(a_k, b_k)$, the definition requires that

$$S \subseteq \bigcup_k (a_k, b_k) \quad\text{and}\quad \sum_k (b_k - a_k) < \varepsilon. \tag{3}$$

If the collection of intervals is finite, the index $k$ in (3) runs over a finite set. If the
collection is countably infinite, then $k$ goes from 1 to $\infty$, and the sum of the lengths
is the sum of an infinite series given by

$$\sum_{k=1}^{\infty} (b_k - a_k) = \lim_{N \to \infty} \sum_{k=1}^{N} (b_k - a_k).$$

Besides the definition, we need one more result about sets of measure zero.

Theorem 7.44. Let $F$ be a countable collection of sets in $\mathbf{R}$, say

$$F = \{F_1, F_2, \dots\},$$

each of which has measure zero. Then their union

$$S = \bigcup_{k=1}^{\infty} F_k$$

also has measure zero.

Proof. Given $\varepsilon > 0$, there is a countable covering of $F_k$ by open intervals, the sum
of whose lengths is less than $\varepsilon/2^k$. The union of all these coverings is itself a
countable covering of $S$ by open intervals and the sum of the lengths of all the
intervals is less than

$$\sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon.$$

Examples. Since a set consisting of just one point has measure zero, it follows that every
countable subset of $\mathbf{R}$ has measure zero. In particular, the set of rational numbers has
measure zero. However, there are uncountable sets which have measure zero. (See Exercise 7.32.)
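
The covering argument for countable sets can be made concrete. The sketch below is illustrative only: it enumerates finitely many rationals in $[0, 1]$ (those with denominator at most 12, an arbitrary cutoff) and gives the $k$-th one an open interval of length $\varepsilon/2^{k+1}$, so the total length stays below $\varepsilon$ however long the enumeration runs. The function names are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """List the distinct rationals p/q in [0, 1] with q <= max_denominator."""
    seen = set()
    ordered = []
    for q in range(1, max_denominator + 1):
        for p in range(0, q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                ordered.append(r)
    return ordered

def cover(points, eps):
    """Centre an open interval of length eps / 2**(k + 1) on the k-th point, k = 1, 2, ..."""
    intervals = []
    for k, r in enumerate(points, start=1):
        half_length = Fraction(eps) / 2 ** (k + 2)
        intervals.append((r - half_length, r + half_length))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
intervals = cover(points, eps)
total_length = sum(right - left for left, right in intervals)
print(len(points), float(total_length))
```

Exact rational arithmetic is used so that the comparison with $\varepsilon$ involves no rounding.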

Next we introduce the concept of oscillation.

Definition 7.45. Let $f$ be defined and bounded on an interval $S$. If $T \subseteq S$, the
number

$$\Omega_f(T) = \sup\,\{f(x) - f(y) : x \in T,\ y \in T\}$$

is called the oscillation of $f$ on $T$. The oscillation of $f$ at $x$ is defined to be the number

$$\omega_f(x) = \lim_{h \to 0+} \Omega_f(B(x; h) \cap S).$$

NOTE. This limit always exists, since $\Omega_f(B(x; h) \cap S)$ is a decreasing function of
$h$. In fact, $T_1 \subseteq T_2$ implies $\Omega_f(T_1) \le \Omega_f(T_2)$. Also, $\omega_f(x) = 0$ if, and only if,
$f$ is continuous at $x$ (Exercise 4.24).
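
A rough numerical picture of the oscillation can be obtained by sampling. The sketch below is an approximation under stated assumptions: the supremum in $\Omega_f(T)$ is replaced by a maximum over an even grid (using $\Omega_f(T) = \sup f - \inf f$ on $T$), the domain $S = [-1, 1]$ is an arbitrary choice, and the helper names are hypothetical. It is applied to a unit step function, which jumps at 0, and to $x^2$, which is continuous.

```python
def oscillation_on(f, lo, hi, samples=2001):
    """Sampled stand-in for Omega_f on [lo, hi]: max f minus min f over an even grid."""
    values = [f(lo + (hi - lo) * i / (samples - 1)) for i in range(samples)]
    return max(values) - min(values)

def oscillation_at(f, x, domain=(-1.0, 1.0), radii=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Sampled estimates of Omega_f(B(x; h) intersected with S) for shrinking radii h."""
    a, b = domain
    return [oscillation_on(f, max(a, x - h), min(b, x + h)) for h in radii]

def step(x):
    # Unit step with a jump of size 1 at x = 0.
    return 1.0 if x >= 0 else 0.0

def smooth(x):
    return x * x

at_jump = oscillation_at(step, 0.0)
at_regular_point = oscillation_at(smooth, 0.5)
print(at_jump)
print(at_regular_point)
```

At the jump the estimates stay at the jump size, while at the point of continuity they shrink with $h$, in line with the NOTE above.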

The next theorem tells us that if $\omega_f(x) < \varepsilon$ at each point of a compact interval
$[a, b]$, then $\Omega_f(T) < \varepsilon$ for all sufficiently small subintervals $T$.

Theorem 7.46. Let $f$ be defined and bounded on $[a, b]$, and let $\varepsilon > 0$ be given.
Assume that $\omega_f(x) < \varepsilon$ for every $x$ in $[a, b]$. Then there exists a $\delta > 0$ (depending
only on $\varepsilon$) such that for every closed subinterval $T \subseteq [a, b]$, we have $\Omega_f(T) < \varepsilon$
whenever the length of $T$ is less than $\delta$.

Proof. For each $x$ in $[a, b]$ there exists a 1-ball $B_x = B(x; \delta_x)$ such that

$$\Omega_f(B_x \cap [a, b]) < \omega_f(x) + [\varepsilon - \omega_f(x)] = \varepsilon.$$

The set of all half-size balls $B(x; \delta_x/2)$ forms an open covering of $[a, b]$. By
compactness, a finite number (say $k$) of these cover $[a, b]$. Let their radii be
$\delta_1/2, \dots, \delta_k/2$ and let $\delta$ be the smallest of these $k$ numbers. When the interval
$T$ has length $< \delta$, then $T$ is partly covered by at least one of these balls, say by
$B(x_p; \delta_p/2)$. However, the ball $B(x_p; \delta_p)$ completely covers $T$ (since $\delta_p \ge 2\delta$).
Moreover, in $B(x_p; \delta_p) \cap [a, b]$ the oscillation of $f$ is less than $\varepsilon$. This implies
that $\Omega_f(T) < \varepsilon$ and the theorem is proved.

Theorem 7.47. Let $f$ be defined and bounded on $[a, b]$. For each $\varepsilon > 0$ define the
set $J_\varepsilon$ as follows:

$$J_\varepsilon = \{x : x \in [a, b],\ \omega_f(x) \ge \varepsilon\}.$$

Then $J_\varepsilon$ is a closed set.

Proof. Let $x$ be an accumulation point of $J_\varepsilon$. If $x \notin J_\varepsilon$, we have $\omega_f(x) < \varepsilon$.
Hence there is a 1-ball $B(x)$ such that

$$\Omega_f(B(x) \cap [a, b]) < \varepsilon.$$

Thus no points of $B(x)$ can belong to $J_\varepsilon$, contradicting the statement that $x$ is an
accumulation point of $J_\varepsilon$. Therefore, $x \in J_\varepsilon$ and $J_\varepsilon$ is closed.

Theorem 7.48 (Lebesgue's criterion for Riemann-integrability). Let $f$ be defined
and bounded on $[a, b]$ and let $D$ denote the set of discontinuities of $f$ in $[a, b]$. Then
$f \in R$ on $[a, b]$ if, and only if, $D$ has measure zero.

Proof. (Necessity). First we assume that $D$ does not have measure zero and show
that $f$ is not integrable. We can write $D$ as a countable union of sets

$$D = \bigcup_{r=1}^{\infty} D_r,$$

where

$$D_r = \left\{x : \omega_f(x) \ge \frac{1}{r}\right\}.$$

If $x \in D$, then $\omega_f(x) > 0$, so $D$ is the union of the sets $D_r$ for $r = 1, 2, \dots$

Now if $D$ does not have measure zero, then some set $D_r$ does not (by Theorem
7.44). Therefore, there is some $\varepsilon > 0$ such that every countable collection of open
intervals covering $D_r$ has a sum of lengths $\ge \varepsilon$. For any partition $P$ of $[a, b]$ we
have

$$U(P, f) - L(P, f) = \sum_{k=1}^{n} [M_k(f) - m_k(f)]\,\Delta x_k = S_1 + S_2 \ge S_1,$$

where $S_1$ contains those terms coming from subintervals containing points of $D_r$
in their interior, and $S_2$ contains the remaining terms. The open intervals from $S_1$
cover $D_r$ except possibly for a finite subset of $D_r$, which has measure 0, so the sum
of their lengths is at least $\varepsilon$. Moreover, in these intervals we have

$$M_k(f) - m_k(f) \ge \frac{1}{r} \quad\text{and hence}\quad S_1 \ge \frac{\varepsilon}{r}.$$

This means that

$$U(P, f) - L(P, f) \ge \frac{\varepsilon}{r}$$

for every partition $P$, so Riemann's condition cannot be satisfied. Therefore $f$ is
not integrable. In other words, if $f \in R$, then $D$ has measure zero.

(Sufficiency). Now we assume that $D$ has measure zero and show that the
Riemann condition is satisfied. Again we write $D = \bigcup_{r=1}^{\infty} D_r$, where $D_r$ is the set of
points $x$ at which $\omega_f(x) \ge 1/r$. Since $D_r \subseteq D$, each $D_r$ has measure 0, so $D_r$ can
be covered by open intervals, the sum of whose lengths is $< 1/r$. Since $D_r$ is compact
(Theorem 7.47), a finite number of these intervals cover $D_r$. The union of these
intervals is an open set which we denote by $A_r$. The complement $B_r = [a, b] - A_r$
is the union of a finite number of closed subintervals of $[a, b]$. Let $I$ be a typical
subinterval of $B_r$. If $x \in I$, then $\omega_f(x) < 1/r$ so, by Theorem 7.46, there is a $\delta > 0$
(depending only on $r$) such that $I$ can be further subdivided into a finite number of
subintervals $T$ of length $< \delta$ in which $\Omega_f(T) < 1/r$. The endpoints of all these
subintervals determine a partition $P_r$ of $[a, b]$. If $P$ is finer than $P_r$ we can write

$$U(P, f) - L(P, f) = \sum_{k=1}^{n} [M_k(f) - m_k(f)]\,\Delta x_k = S_1 + S_2,$$

where $S_1$ contains those terms coming from subintervals containing points of
$D_r$, and $S_2$ contains the remaining terms. In the $k$th term of $S_2$ we have

$$M_k(f) - m_k(f) < \frac{1}{r} \quad\text{and hence}\quad S_2 < \frac{b - a}{r}.$$

Since $A_r$ covers all the intervals contributing to $S_1$, we have

$$S_1 \le \frac{M - m}{r},$$

where $M$ and $m$ are the sup and inf of $f$ on $[a, b]$. Therefore

$$U(P, f) - L(P, f) < \frac{M - m + b - a}{r}.$$

Since this holds for every $r \ge 1$, we see that Riemann's condition holds, so $f \in R$
on $[a, b]$.

NOTE. A property is said to hold almost everywhere on a subset $S$ of $\mathbf{R}^1$ if it holds
everywhere on $S$ except for a set of measure 0. Thus, Lebesgue's theorem states
that a bounded function $f$ on a compact interval $[a, b]$ is Riemann integrable on
$[a, b]$ if, and only if, $f$ is continuous almost everywhere on $[a, b]$.

The following statements (some of which were proved earlier in the chapter)
are immediate consequences of Lebesgue's theorem.

Theorem 7.49. a) If $f$ is of bounded variation on $[a, b]$, then $f \in R$ on $[a, b]$.

b) If $f \in R$ on $[a, b]$, then $f \in R$ on $[c, d]$ for every subinterval $[c, d] \subseteq [a, b]$,
$|f| \in R$ and $f^2 \in R$ on $[a, b]$. Also, $f \cdot g \in R$ on $[a, b]$ whenever $g \in R$ on
$[a, b]$.

c) If $f \in R$ and $g \in R$ on $[a, b]$, then $f/g \in R$ on $[a, b]$ whenever $g$ is bounded away
from 0.

d) If $f$ and $g$ are bounded functions having the same discontinuities on $[a, b]$, then
$f \in R$ on $[a, b]$ if, and only if, $g \in R$ on $[a, b]$.

e) Let $g \in R$ on $[a, b]$ and assume that $m \le g(x) \le M$ for all $x$ in $[a, b]$. If $f$ is
continuous on $[m, M]$, the composite function $h$ defined by $h(x) = f[g(x)]$ is
Riemann-integrable on $[a, b]$.

NOTE. Statement (e) need not hold if we assume only that $f \in R$ on $[m, M]$.
(See Exercise 7.29.)


7.27 COMPLEX-VALUED RIEMANN-STIELTJES INTEGRALS 

Riemann-Stieltjes integrals of the form $\int_a^b f\,d\alpha$, in which $f$ and $\alpha$ are complex-valued
functions defined and bounded on an interval $[a, b]$, are of fundamental
importance in the theory of functions of a complex variable. They can be introduced
by exactly the same definition we have used in the real case. In fact,
Definition 7.1 is meaningful when $f$ and $\alpha$ are complex-valued. The sums of the
products $f(t_k)[\alpha(x_k) - \alpha(x_{k-1})]$ which are used to form Riemann-Stieltjes sums
need only be interpreted as sums of products of complex numbers. Since complex
numbers satisfy the commutative, associative, and distributive laws which hold
for real numbers, it is not surprising that complex-valued integrals share many of
the properties of real-valued integrals. In particular, Theorems 7.2, 7.3, 7.4, 7.6,
and 7.7 (as well as their proofs) are all valid (word for word) when $f$ and $\alpha$ are
complex-valued functions. (In Theorems 7.2 and 7.3, the constants $c_1$ and $c_2$ may
now be complex numbers.) In addition, we have the following theorem which, in
effect, reduces the theory of complex Stieltjes integrals to the real case.


Theorem 7.50. Let $f = f_1 + i f_2$ and $\alpha = \alpha_1 + i \alpha_2$ be complex-valued functions
defined on an interval $[a, b]$. Then we have

$$\int_a^b f\,d\alpha = \left(\int_a^b f_1\,d\alpha_1 - \int_a^b f_2\,d\alpha_2\right) + i \left(\int_a^b f_2\,d\alpha_1 + \int_a^b f_1\,d\alpha_2\right)$$

whenever all four integrals on the right exist.

The proof of Theorem 7.50 is immediate from the definition and is left to the
reader.
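
The identity in Theorem 7.50 already holds at the level of Riemann-Stieltjes sums, which is why the proof is immediate. The sketch below checks this for one partition; the particular functions $f(x) = x + i x^2$ and $\alpha(x) = \sin x + i e^x$ on $[0, 1]$, the uniform partition, and the name `rs_sum` are illustrative assumptions.

```python
import math

def rs_sum(f, alpha, points, tags):
    """Riemann-Stieltjes sum: sum over k of f(t_k) * (alpha(x_k) - alpha(x_{k-1}))."""
    return sum(f(t) * (alpha(x1) - alpha(x0))
               for x0, x1, t in zip(points, points[1:], tags))

# Real and imaginary parts of the illustrative f and alpha.
def f1(x): return x
def f2(x): return x * x
def alpha1(x): return math.sin(x)
def alpha2(x): return math.exp(x)

def f(x): return complex(f1(x), f2(x))
def alpha(x): return complex(alpha1(x), alpha2(x))

n = 100
points = [i / n for i in range(n + 1)]          # uniform partition of [0, 1]
tags = [(i + 0.5) / n for i in range(n)]        # one tag t_k in each subinterval

direct = rs_sum(f, alpha, points, tags)
combined = complex(
    rs_sum(f1, alpha1, points, tags) - rs_sum(f2, alpha2, points, tags),
    rs_sum(f2, alpha1, points, tags) + rs_sum(f1, alpha2, points, tags))
print(direct, combined)
```

The two complex numbers agree up to floating-point rounding, for any partition and any choice of tags.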

The use of this theorem permits us to extend most of the important properties
of real integrals to the complex case. For example, the connection between
differentiation and integration established in Theorem 7.32 remains valid for
complex integrals if we simply define such notions as continuity, differentiability
and bounded variation by components, as with vector-valued functions. Thus, we
say that the complex-valued function $\alpha = \alpha_1 + i\alpha_2$ is of bounded variation on
$[a, b]$ if each component $\alpha_1$ and $\alpha_2$ is of bounded variation on $[a, b]$. Similarly,
the derivative $\alpha'(t)$ is defined by the equation $\alpha'(t) = \alpha_1'(t) + i\alpha_2'(t)$ whenever the
derivatives $\alpha_1'(t)$ and $\alpha_2'(t)$ exist. (One-sided derivatives are defined in the same
way.) With this understanding, Theorems 7.32 and 7.34 (the fundamental theorems
of integral calculus) both remain valid when $f$ and $\alpha$ are complex-valued. The
proofs follow from the real case by using Theorem 7.50 in a straightforward
manner.

We shall return to complex-valued integrals in Chapter 16, when we study
functions of a complex variable in more detail.


EXERCISES 


Riemann-Stieltjes integrals 

7.1 Prove that $\int_a^b d\alpha(x) = \alpha(b) - \alpha(a)$, directly from Definition 7.1.

7.2 If $f \in R(\alpha)$ on $[a, b]$ and if $\int_a^b f\,d\alpha = 0$ for every $f$ which is monotonic on $[a, b]$,
prove that $\alpha$ must be constant on $[a, b]$.

7.3 The following definition of a Riemann-Stieltjes integral is often used in the literature:
We say $f$ is integrable with respect to $\alpha$ if there exists a real number $A$ having the property
that for every $\varepsilon > 0$, there exists a $\delta > 0$ such that for every partition $P$ of $[a, b]$ with
norm $\|P\| < \delta$ and for every choice of $t_k$ in $[x_{k-1}, x_k]$, we have $|S(P, f, \alpha) - A| < \varepsilon$.

a) Show that if $\int_a^b f\,d\alpha$ exists according to this definition, then it also exists according
to Definition 7.1 and the two integrals are equal.

b) Let $f(x) = \alpha(x) = 0$ for $a \le x < c$, $f(x) = \alpha(x) = 1$ for $c < x \le b$, $f(c) = 0$,
$\alpha(c) = 1$. Show that $\int_a^b f\,d\alpha$ exists according to Definition 7.1 but does not exist
by this second definition.

7.4 If $f \in R$ according to Definition 7.1, prove that $\int_a^b f(x)\,dx$ also exists according to the
definition of Exercise 7.3. [Contrast with Exercise 7.3(b).] Hint. Let $I = \int_a^b f(x)\,dx$,
$M = \sup\,\{|f(x)| : x \in [a, b]\}$. Given $\varepsilon > 0$, choose $P_\varepsilon$ so that $U(P_\varepsilon, f) < I + \varepsilon/2$
(notation of Section 7.11). Let $N$ be the number of subdivision points in $P_\varepsilon$ and let
$\delta = \varepsilon/(2MN)$. If $\|P\| < \delta$, write

$$U(P, f) = \sum M_k(f)\,\Delta x_k = S_1 + S_2,$$

where $S_1$ is the sum of terms arising from those subintervals of $P$ containing no points of
$P_\varepsilon$ and $S_2$ is the sum of the remaining terms. Then

$$S_1 \le U(P_\varepsilon, f) < I + \varepsilon/2 \quad\text{and}\quad S_2 \le NM\|P\| < NM\delta = \varepsilon/2,$$

and hence $U(P, f) < I + \varepsilon$. Similarly,

$$L(P, f) > I - \varepsilon \quad\text{if } \|P\| < \delta' \text{ for some } \delta'.$$

Hence $|S(P, f) - I| < \varepsilon$ if $\|P\| < \min(\delta, \delta')$.

7.5 Let $\{a_n\}$ be a sequence of real numbers. For $x \ge 0$, define

$$A(x) = \sum_{n \le x} a_n = \sum_{n=1}^{[x]} a_n,$$

where $[x]$ is the greatest integer in $x$ and empty sums are interpreted as zero. Let $f$ have
a continuous derivative in the interval $1 \le x \le a$. Use Stieltjes integrals to derive the
following formula:

$$\sum_{n \le a} a_n f(n) = -\int_1^a A(x) f'(x)\,dx + A(a) f(a).$$
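
The formula of Exercise 7.5 can be sanity-checked exactly when the upper limit is an integer, because $A(x)$ is constant on each interval $[n, n+1)$ and the integral then reduces to a finite sum. The sketch below is only a check, not a derivation; the data $a_n = n$, $f(x) = 1/x^2$, the upper limit 10, and the name `abel_sides` are assumptions made for illustration.

```python
from fractions import Fraction

def abel_sides(a_seq, f, upper):
    """Return both sides of the formula for an integer upper limit; a_seq[n - 1] holds a_n."""
    lhs = sum(a_seq[n - 1] * f(n) for n in range(1, upper + 1))
    # A(x) equals the partial sum A(n) on [n, n + 1), so the integral of A(x) f'(x)
    # over [n, n + 1] equals A(n) * (f(n + 1) - f(n)).
    partial = Fraction(0)
    integral = Fraction(0)
    for n in range(1, upper):
        partial += a_seq[n - 1]
        integral += partial * (f(n + 1) - f(n))
    partial += a_seq[upper - 1]          # partial is now A(upper)
    rhs = -integral + partial * f(upper)
    return lhs, rhs

a_seq = [Fraction(n) for n in range(1, 11)]                       # a_n = n
lhs, rhs = abel_sides(a_seq, lambda x: Fraction(1, x * x), 10)    # f(x) = 1 / x^2
print(lhs, rhs)
```

Exact rational arithmetic makes the two sides identical rather than merely close.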


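7.6 Use Euler's summation formula, or integration by parts in a Stieltjes integral, to
derive the following identities:

a) $\displaystyle \sum_{k=1}^{n} \frac{1}{k^s} = \frac{1}{n^{s-1}} + s \int_1^n \frac{[x]}{x^{s+1}}\,dx$ if $s \ne 1$.

b) $\displaystyle \sum_{k=1}^{n} \frac{1}{k} = \log n - \int_1^n \frac{x - [x]}{x^2}\,dx + 1$.

7.7 Assume $f'$ is continuous on $[1, 2n]$ and use Euler's summation formula or integration
by parts to prove that

$$\sum_{k=1}^{2n} (-1)^k f(k) = \int_1^{2n} f'(x)\,([x] - 2[x/2])\,dx.$$

7.8 Let $\varphi_1(x) = x - [x] - \tfrac{1}{2}$ if $x \ne$ integer, and let $\varphi_1(x) = 0$ if $x =$ integer. Also,
let $\varphi_2(x) = \int_0^x \varphi_1(t)\,dt$. If $f''$ is continuous on $[1, n]$ prove that Euler's summation
formula implies that

$$\sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx - \int_1^n \varphi_2(x) f''(x)\,dx + \frac{f(1) + f(n)}{2}.$$

7.9 Take $f(x) = \log x$ in Exercise 7.8 and prove that

$$\log n! = \left(n + \tfrac{1}{2}\right) \log n - n + 1 + \int_1^n \frac{\varphi_2(t)}{t^2}\,dt.$$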

7.10 If $x \ge 1$, let $\pi(x)$ denote the number of primes $\le x$, that is,

$$\pi(x) = \sum_{p \le x} 1,$$

where the sum is extended over all primes $p \le x$. The prime number theorem states that

$$\lim_{x \to \infty} \pi(x)\,\frac{\log x}{x} = 1.$$

This is usually proved by studying a related function $\vartheta$ given by

$$\vartheta(x) = \sum_{p \le x} \log p,$$

where again the sum is extended over all primes $p \le x$. Both functions $\pi$ and $\vartheta$ are step
functions with jumps at the primes. This exercise shows how the Riemann-Stieltjes
integral can be used to relate these two functions.

a) If $x \ge 2$, prove that $\pi(x)$ and $\vartheta(x)$ can be expressed as the following Riemann-Stieltjes
integrals:

$$\vartheta(x) = \int_{3/2}^{x} \log t\,d\pi(t), \qquad \pi(x) = \int_{3/2}^{x} \frac{1}{\log t}\,d\vartheta(t).$$

NOTE. The lower limit can be replaced by any number in the open interval (1, 2).

b) If $x \ge 2$, use integration by parts to show that

$$\vartheta(x) = \pi(x) \log x - \int_2^x \frac{\pi(t)}{t}\,dt,$$

$$\pi(x) = \frac{\vartheta(x)}{\log x} + \int_2^x \frac{\vartheta(t)}{t \log^2 t}\,dt.$$

These equations can be used to prove that the prime number theorem is equivalent
to the relation $\lim_{x \to \infty} \vartheta(x)/x = 1$.


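7.11 If $\alpha \nearrow$ on $[a, b]$, prove that we have

a) $\displaystyle \overline{\int_a^b} f\,d\alpha = \overline{\int_a^c} f\,d\alpha + \overline{\int_c^b} f\,d\alpha, \quad (a < c < b),$

b) $\displaystyle \overline{\int_a^b} (f + g)\,d\alpha \le \overline{\int_a^b} f\,d\alpha + \overline{\int_a^b} g\,d\alpha,$

c) $\displaystyle \underline{\int_a^b} (f + g)\,d\alpha \ge \underline{\int_a^b} f\,d\alpha + \underline{\int_a^b} g\,d\alpha.$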

7.12 Give an example of a bounded function $f$ and an increasing function $\alpha$ defined on
$[a, b]$ such that $|f| \in R(\alpha)$ but for which $\int_a^b f\,d\alpha$ does not exist.

7.13 Let $\alpha$ be a continuous function of bounded variation on $[a, b]$. Assume $g \in R(\alpha)$
on $[a, b]$ and define $\beta(x) = \int_a^x g(t)\,d\alpha(t)$ if $x \in [a, b]$. Show that:

a) If $f \nearrow$ on $[a, b]$, there exists a point $x_0$ in $[a, b]$ such that

$$\int_a^b f\,d\beta = f(a) \int_a^{x_0} g\,d\alpha + f(b) \int_{x_0}^{b} g\,d\alpha.$$

b) If, in addition, $f$ is continuous on $[a, b]$, we also have

$$\int_a^b f(x) g(x)\,d\alpha(x) = f(a) \int_a^{x_0} g\,d\alpha + f(b) \int_{x_0}^{b} g\,d\alpha.$$

7.14 Assume $f \in R(\alpha)$ on $[a, b]$, where $\alpha$ is of bounded variation on $[a, b]$. Let $V(x)$
denote the total variation of $\alpha$ on $[a, x]$ for each $x$ in $(a, b]$, and let $V(a) = 0$. Show that

$$\left|\int_a^b f\,d\alpha\right| \le \int_a^b |f|\,dV \le M V(b),$$

where $M$ is an upper bound for $|f|$ on $[a, b]$. In particular, when $\alpha(x) = x$, the inequality
becomes

$$\left|\int_a^b f(x)\,dx\right| \le M(b - a).$$

7.15 Let $\{\alpha_n\}$ be a sequence of functions of bounded variation on $[a, b]$. Suppose there
exists a function $\alpha$ defined on $[a, b]$ such that the total variation of $\alpha - \alpha_n$ on $[a, b]$ tends
to 0 as $n \to \infty$. Assume also that $\alpha(a) = \alpha_n(a) = 0$ for each $n = 1, 2, \dots$ If $f$ is continuous
on $[a, b]$, prove that

$$\lim_{n \to \infty} \int_a^b f(x)\,d\alpha_n(x) = \int_a^b f(x)\,d\alpha(x).$$

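7.16 If $f \in R(\alpha)$, $f^2 \in R(\alpha)$, $g \in R(\alpha)$, and $g^2 \in R(\alpha)$ on $[a, b]$, prove that

$$\frac{1}{2} \int_a^b \left[\int_a^b \begin{vmatrix} f(x) & g(x) \\ f(y) & g(y) \end{vmatrix}^2 d\alpha(y)\right] d\alpha(x) = \left(\int_a^b f(x)^2\,d\alpha(x)\right) \left(\int_a^b g(x)^2\,d\alpha(x)\right) - \left(\int_a^b f(x) g(x)\,d\alpha(x)\right)^2.$$

When $\alpha \nearrow$ on $[a, b]$, deduce the Cauchy-Schwarz inequality

$$\left(\int_a^b f(x) g(x)\,d\alpha(x)\right)^2 \le \left(\int_a^b f(x)^2\,d\alpha(x)\right) \left(\int_a^b g(x)^2\,d\alpha(x)\right).$$

(Compare with Exercise 1.23.)

7.17 Assume that $f \in R(\alpha)$, $g \in R(\alpha)$, and $f \cdot g \in R(\alpha)$ on $[a, b]$. Show that

$$\frac{1}{2} \int_a^b \left[\int_a^b (f(y) - f(x))(g(y) - g(x))\,d\alpha(y)\right] d\alpha(x) = (\alpha(b) - \alpha(a)) \int_a^b f(x) g(x)\,d\alpha(x) - \left(\int_a^b f(x)\,d\alpha(x)\right) \left(\int_a^b g(x)\,d\alpha(x)\right).$$

If $\alpha \nearrow$ on $[a, b]$, deduce the inequality

$$\left(\int_a^b f(x)\,d\alpha(x)\right) \left(\int_a^b g(x)\,d\alpha(x)\right) \le (\alpha(b) - \alpha(a)) \int_a^b f(x) g(x)\,d\alpha(x)$$

when both $f$ and $g$ are increasing (or both are decreasing) on $[a, b]$. Show that the reverse
inequality holds if $f$ increases and $g$ decreases on $[a, b]$.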
7.18 Assume $f \in R$ on $[a, b]$. Use Exercise 7.4 to prove that the limit

$$\lim_{n \to \infty} \frac{b - a}{n} \sum_{k=1}^{n} f\left(a + k\,\frac{b - a}{n}\right)$$

exists and has the value $\int_a^b f(x)\,dx$. Deduce that

$$\lim_{n \to \infty} \sum_{k=1}^{n} \frac{n}{k^2 + n^2} = \frac{\pi}{4}, \qquad \lim_{n \to \infty} \sum_{k=1}^{n} (n^2 + k^2)^{-1/2} = \log(1 + \sqrt{2}).$$

7.19 Define

$$f(x) = \left(\int_0^x e^{-t^2}\,dt\right)^2, \qquad g(x) = \int_0^1 \frac{e^{-x^2(t^2 + 1)}}{t^2 + 1}\,dt.$$

a) Show that $g'(x) + f'(x) = 0$ for all $x$ and deduce that $g(x) + f(x) = \pi/4$.

b) Use (a) to prove that

$$\lim_{x \to \infty} \int_0^x e^{-t^2}\,dt = \tfrac{1}{2} \sqrt{\pi}.$$

7.20 Assume $g \in R$ on $[a, b]$ and define $f(x) = \int_a^x g(t)\,dt$ if $x \in [a, b]$. Prove that the
integral $\int_a^x |g(t)|\,dt$ gives the total variation of $f$ on $[a, x]$.

7.21 Let $\mathbf{f} = (f_1, \dots, f_n)$ be a vector-valued function with a continuous derivative $\mathbf{f}'$ on
$[a, b]$. Prove that the curve described by $\mathbf{f}$ has length

$$\Lambda_{\mathbf{f}}(a, b) = \int_a^b \|\mathbf{f}'(t)\|\,dt.$$

7.22 If $f^{(n+1)}$ is continuous on $[a, x]$, define

$$I_n(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\,dt.$$

a) Show that

$$I_{k-1}(x) - I_k(x) = \frac{f^{(k)}(a)\,(x - a)^k}{k!}, \qquad k = 1, 2, \dots, n.$$

b) Use (a) to express the remainder in Taylor's formula (Theorem 5.19) as an integral.

7.23 Let $f$ be continuous on $[0, a]$. If $x \in [0, a]$, define $f_0(x) = f(x)$ and let

$$f_{n+1}(x) = \frac{1}{n!} \int_0^x (x - t)^n f(t)\,dt, \qquad n = 0, 1, 2, \dots$$

a) Show that the $n$th derivative of $f_n$ exists and equals $f$.

b) Prove the following theorem of M. Fekete: The number of changes in sign of $f$
in $[0, a]$ is not less than the number of changes in sign in the ordered set of
numbers

$$f(a),\ f_1(a),\ \dots,\ f_n(a).$$

Hint. Use mathematical induction.

c) Use (b) to prove the following theorem of L. Fejér: The number of changes in
sign of $f$ in $[0, a]$ is not less than the number of changes in sign in the ordered set

$$f(0),\ \int_0^a f(t)\,dt,\ \int_0^a t f(t)\,dt,\ \dots,\ \int_0^a t^n f(t)\,dt.$$


7.24 Let $f$ be a positive continuous function in $[a, b]$. Let $M$ denote the maximum value
of $f$ on $[a, b]$. Show that

$$\lim_{n \to \infty} \left(\int_a^b f(x)^n\,dx\right)^{1/n} = M.$$


7.25 A function $f$ of two real variables is defined for each point $(x, y)$ in the unit square
$0 \le x \le 1$, $0 \le y \le 1$ as follows:

$$f(x, y) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 2y, & \text{if } x \text{ is irrational}. \end{cases}$$

a) Compute $\overline{\int_0^1} f(x, y)\,dx$ and $\underline{\int_0^1} f(x, y)\,dx$ in terms of $y$.

b) Show that $\int_0^1 f(x, y)\,dy$ exists for each fixed $x$ and compute $\int_0^t f(x, y)\,dy$ in terms
of $x$ and $t$ for $0 \le x \le 1$, $0 \le t \le 1$.

c) Let $F(x) = \int_0^1 f(x, y)\,dy$. Show that $\int_0^1 F(x)\,dx$ exists and find its value.

7.26 Let $f$ be defined on $[0, 1]$ as follows: $f(0) = 0$; if $2^{-n-1} < x \le 2^{-n}$, then $f(x) = 2^{-n}$,
for $n = 0, 1, 2, \dots$

a) Give two reasons why $\int_0^1 f(x)\,dx$ exists.

b) Let $F(x) = \int_0^x f(t)\,dt$. Show that for $0 < x \le 1$ we have

$$F(x) = x A(x) - \tfrac{1}{3} A(x)^2,$$

where $A(x) = 2^{-[-\log x / \log 2]}$ and where $[y]$ is the greatest integer in $y$.

7.27 Assume $f$ has a derivative which is monotonic decreasing and satisfies $f'(x) \ge m > 0$
for all $x$ in $[a, b]$. Prove that

$$\left|\int_a^b \cos f(x)\,dx\right| \le \frac{2}{m}.$$

Hint. Multiply and divide the integrand by $f'(x)$ and use Theorem 7.37(ii).

7.28 Given a decreasing sequence of real numbers $\{G(n)\}$ such that $G(n) \to 0$ as $n \to \infty$.
Define a function $f$ on $[0, 1]$ in terms of $\{G(n)\}$ as follows: $f(0) = 1$; if $x$ is irrational, then
$f(x) = 0$; if $x$ is the rational $m/n$ (in lowest terms), then $f(m/n) = G(n)$. Compute the
oscillation $\omega_f(x)$ at each $x$ in $[0, 1]$ and show that $f \in R$ on $[0, 1]$.

7.29 Let $f$ be defined as in Exercise 7.28 with $G(n) = 1/n$. Let $g(x) = 1$ if $0 < x \le 1$,
$g(0) = 0$. Show that the composite function $h$ defined by $h(x) = g[f(x)]$ is not Riemann-integrable
on $[0, 1]$, although both $f \in R$ and $g \in R$ on $[0, 1]$.

7.30 Use Lebesgue's theorem to prove Theorem 7.49.

7.31 Use Lebesgue's theorem to prove that if $f \in R$ and $g \in R$ on $[a, b]$ and if $f(x) \ge m > 0$
for all $x$ in $[a, b]$, then the function $h$ defined by

$$h(x) = f(x)^{g(x)}$$

is Riemann-integrable on $[a, b]$.
7.32 Let $I = [0, 1]$ and let $A_1 = I - (\tfrac{1}{3}, \tfrac{2}{3})$ be that subset of $I$ obtained by removing
those points which lie in the open middle third of $I$; that is, $A_1 = [0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$. Let
$A_2$ be that subset of $A_1$ obtained by removing the open middle third of $[0, \tfrac{1}{3}]$ and of
$[\tfrac{2}{3}, 1]$. Continue this process and define $A_3, A_4, \dots$ The set $C = \bigcap_{n=1}^{\infty} A_n$ is called the
Cantor set. Prove that:

a) $C$ is a compact set having measure zero.

b) $x \in C$ if, and only if, $x = \sum_{n=1}^{\infty} a_n 3^{-n}$, where each $a_n$ is either 0 or 2.

c) $C$ is uncountable.

d) Let $f(x) = 1$ if $x \in C$, $f(x) = 0$ if $x \notin C$. Prove that $f \in R$ on $[0, 1]$.
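
The construction in Exercise 7.32 is easy to carry out for the first few stages. The sketch below builds the intervals making up $A_1, \dots, A_8$ (eight stages is an arbitrary cutoff) with exact rational endpoints and records their total length, which indicates why part (a) should hold. The function name is hypothetical.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [lo, hi] by its two outer closed thirds."""
    result = []
    for lo, hi in intervals:
        third = (hi - lo) / 3
        result.append((lo, lo + third))
        result.append((hi - third, hi))
    return result

stage = [(Fraction(0), Fraction(1))]
lengths = []
for n in range(1, 9):
    stage = remove_middle_thirds(stage)                # intervals making up A_n
    lengths.append(sum(hi - lo for lo, hi in stage))   # total length of A_n

print([float(length) for length in lengths])
```

Stage $n$ consists of $2^n$ closed intervals of total length $(2/3)^n$, so $C$ is covered by finitely many intervals of arbitrarily small total length.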

7.33 This exercise outlines a proof (due to Ivan Niven) that $\pi^2$ is irrational. Let
$f(x) = x^n (1 - x)^n / n!$. Prove that:

a) $0 < f(x) < 1/n!$ if $0 < x < 1$.

b) Each $k$th derivative $f^{(k)}(0)$ and $f^{(k)}(1)$ is an integer.

Now assume that $\pi^2 = a/b$, where $a$ and $b$ are positive integers, and let

$$F(x) = b^n \sum_{k=0}^{n} (-1)^k f^{(2k)}(x)\,\pi^{2n - 2k}.$$

Prove that:

c) $F(0)$ and $F(1)$ are integers.

d) $\displaystyle \pi^2 a^n f(x) \sin \pi x = \frac{d}{dx} \{F'(x) \sin \pi x - \pi F(x) \cos \pi x\}.$

e) $\displaystyle F(1) + F(0) = \pi a^n \int_0^1 f(x) \sin \pi x\,dx.$

f) Use (a) in (e) to deduce that $0 < F(1) + F(0) < 1$ if $n$ is sufficiently large. This
contradicts (c) and shows that $\pi^2$ (and hence $\pi$) is irrational.

7.34 Given a real-valued function $\alpha$, continuous on the interval $[a, b]$ and having a finite
bounded derivative $\alpha'$ on $(a, b)$. Let $f$ be defined and bounded on $[a, b]$ and assume that
both integrals

$$\int_a^b f(x)\,d\alpha(x) \quad\text{and}\quad \int_a^b f(x)\,\alpha'(x)\,dx$$

exist. Prove that these integrals are equal. (It is not assumed that $\alpha'$ is continuous.)

7.35 Prove the following theorem, which implies that a function with a positive integral
must itself be positive on some interval. Assume that $f \in R$ on $[a, b]$ and that $0 \le f(x) \le M$
on $[a, b]$, where $M > 0$. Let $I = \int_a^b f(x)\,dx$, let $h = \tfrac{1}{2} I / (M + b - a)$, and assume
that $I > 0$. Then the set $T = \{x : f(x) \ge h\}$ contains a finite number of intervals, the
sum of whose lengths is at least $h$. Hint. Let $P$ be a partition of $[a, b]$ such that every
Riemann sum $S(P, f) = \sum_{k=1}^{n} f(t_k)\,\Delta x_k$ satisfies $S(P, f) > I/2$. Split $S(P, f)$ into two
parts, $S(P, f) = \sum_{k \in A} + \sum_{k \in B}$, where

$$A = \{k : [x_{k-1}, x_k] \subseteq T\} \quad\text{and}\quad B = \{k : k \notin A\}.$$

If $k \in A$, use the inequality $f(t_k) \le M$; if $k \in B$, choose $t_k$ so that $f(t_k) < h$. Deduce that
$\sum_{k \in A} \Delta x_k > h$.

Existence theorems for integral and differential equations

The following exercises illustrate how the fixed-point theorem for contractions (Theorem
4.48) is used to prove existence theorems for solutions of certain integral and differential
equations. We denote by $C[a, b]$ the metric space of all real continuous functions on
$[a, b]$ with the metric

$$d(f, g) = \|f - g\| = \max_{a \le x \le b} |f(x) - g(x)|,$$

and recall that $C[a, b]$ is a complete metric space (Exercise 4.67).

7.36 Given a function $g$ in $C[a, b]$, and a function $K$ continuous on the rectangle
$Q = [a, b] \times [a, b]$, consider the function $T$ defined on $C[a, b]$ by the equation

$$T(\varphi)(x) = g(x) + \lambda \int_a^b K(x, t)\,\varphi(t)\,dt,$$

where $\lambda$ is a given constant.

a) Prove that $T$ maps $C[a, b]$ into itself.

b) If $|K(x, y)| \le M$ on $Q$, where $M > 0$, and if $|\lambda| < M^{-1}(b - a)^{-1}$, prove that
$T$ is a contraction of $C[a, b]$ and hence has a fixed point $\varphi$ which is a solution of
the integral equation $\varphi(x) = g(x) + \lambda \int_a^b K(x, t)\,\varphi(t)\,dt$.
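
The fixed point in Exercise 7.36 can be approximated by iterating $T$ on a grid. The sketch below is a discretized illustration under stated assumptions, not the argument the exercise asks for: the integral is replaced by a midpoint sum, and the data $K(x, t) = xt$, $g(x) = x$, $\lambda = 1/2$ on $[0, 1]$ are chosen so that $|\lambda| M (b - a) = 1/2 < 1$ and the exact solution is $\varphi(x) = 6x/5$. The name `solve_by_iteration` is hypothetical.

```python
def solve_by_iteration(kernel, g, lam, a, b, n=200, iterations=60):
    """Iterate phi <- g + lam * (midpoint sum of kernel(x, t) * phi(t)) on n midpoints."""
    width = (b - a) / n
    nodes = [a + (i + 0.5) * width for i in range(n)]
    phi = [g(x) for x in nodes]              # start the iteration from phi = g
    for _ in range(iterations):
        phi = [g(x) + lam * width * sum(kernel(x, t) * p for t, p in zip(nodes, phi))
               for x in nodes]
    return nodes, phi

nodes, phi = solve_by_iteration(lambda x, t: x * t, lambda x: x, 0.5, 0.0, 1.0)
worst_gap = max(abs(p - 1.2 * x) for x, p in zip(nodes, phi))
print(worst_gap)
```

The printed gap reflects only the quadrature error of the grid, since the iteration itself converges geometrically.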

7.37 Assume $f$ is continuous on a rectangle $Q = [a - h, a + h] \times [b - k, b + k]$,
where $h > 0$, $k > 0$.

a) Let $\varphi$ be a function, continuous on $[a - h, a + h]$, such that $(x, \varphi(x)) \in Q$ for
all $x$ in $[a - h, a + h]$. If $0 < c \le h$, prove that $\varphi$ satisfies the differential
equation $y' = f(x, y)$ on $(a - c, a + c)$ and the initial condition $\varphi(a) = b$ if,
and only if, $\varphi$ satisfies the integral equation

$$\varphi(x) = b + \int_a^x f(t, \varphi(t))\,dt \quad\text{on } (a - c, a + c).$$

b) Assume that $|f(x, y)| \le M$ on $Q$, where $M > 0$, and let $c = \min\,\{h, k/M\}$.
Let $S$ denote the metric subspace of $C[a - c, a + c]$ consisting of all $\varphi$ such
that $|\varphi(x) - b| \le Mc$ on $[a - c, a + c]$. Prove that $S$ is a closed subspace of
$C[a - c, a + c]$ and hence that $S$ is itself a complete metric space.

c) Prove that the function $T$ defined on $S$ by the equation

$$T(\varphi)(x) = b + \int_a^x f(t, \varphi(t))\,dt$$

maps $S$ into itself.

d) Now assume that $f$ satisfies a Lipschitz condition of the form

$$|f(x, y) - f(x, z)| \le A|y - z|$$

for every pair of points $(x, y)$ and $(x, z)$ in $Q$, where $A > 0$. Prove that $T$ is a
contraction of $S$ if $h < 1/A$. Deduce that for $h < 1/A$ the differential equation
$y' = f(x, y)$ has exactly one solution $y = \varphi(x)$ on $(a - c, a + c)$ such that
$\varphi(a) = b$.
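
The map $T$ of Exercise 7.37 can be iterated numerically to watch the fixed point emerge. The sketch below is an illustration under stated assumptions: it takes $f(x, y) = y$ with $a = 0$, $b = 1$ (so the exact solution is $e^x$ and the Lipschitz constant is $A = 1$), works only on the right half $[a, a + c]$ with $c = 1/2$, and replaces the integral by the trapezoidal rule on a uniform grid. The name `picard` is hypothetical.

```python
import math

def picard(f, a, b, c, n=400, iterations=25):
    """Iterate T(phi)(x) = b + integral from a to x of f(t, phi(t)) dt on [a, a + c]."""
    width = c / n
    xs = [a + i * width for i in range(n + 1)]
    phi = [b] * (n + 1)                      # start from the constant function b
    for _ in range(iterations):
        values = [f(x, p) for x, p in zip(xs, phi)]
        new_phi = [b]
        for i in range(n):
            # Trapezoidal rule on [x_i, x_{i+1}], accumulated from the left endpoint a.
            new_phi.append(new_phi[-1] + 0.5 * width * (values[i] + values[i + 1]))
        phi = new_phi
    return xs, phi

xs, phi = picard(lambda x, y: y, 0.0, 1.0, 0.5)
gap_at_right_end = abs(phi[-1] - math.exp(0.5))
print(gap_at_right_end)
```

Each pass of the loop applies the discretized $T$ once, so the successive lists `phi` play the role of the iterates $T^m(\varphi_0)$.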
CHAPTER 8


INFINITE SERIES 
AND INFINITE PRODUCTS 


8.1 INTRODUCTION 

This chapter gives a brief development of the theory of infinite series and infinite 
products. These are merely special infinite sequences whose terms are real or 
complex numbers. Convergent sequences were discussed in Chapter 4 in the setting 
of general metric spaces. We recall some of the concepts of Chapter 4 as they apply 
to sequences in C with the usual Euclidean metric. 

8.2 CONVERGENT AND DIVERGENT SEQUENCES OF COMPLEX NUMBERS 

Definition 8.1. A sequence $\{a_n\}$ of points in C is said to converge if there is a point $p$
in C with the following property:

For every $\varepsilon > 0$ there is an integer $N$ (depending on $\varepsilon$) such that

$$|a_n - p| < \varepsilon \quad\text{whenever } n \ge N.$$

If $\{a_n\}$ converges to $p$, we write $\lim_{n \to \infty} a_n = p$ and call $p$ the limit of the sequence.
A sequence is called divergent if it is not convergent.

A sequence in C is called a Cauchy sequence if it satisfies the Cauchy condition;
that is, for every $\varepsilon > 0$ there is an integer $N$ such that

$$|a_n - a_m| < \varepsilon \quad\text{whenever } n \ge N \text{ and } m \ge N.$$

Since C is a complete metric space, we know from Chapter 4 that a sequence in C
is convergent if, and only if, it is a Cauchy sequence.

The Cauchy condition is particularly useful in establishing convergence when
we do not know the actual value to which the sequence converges.

Every convergent sequence is bounded (Theorem 4.3) and hence an unbounded
sequence necessarily diverges.

If a sequence $\{a_n\}$ converges to $p$, then every subsequence $\{a_{k_n}\}$ also converges
to $p$ (Theorem 4.5).

A sequence $\{a_n\}$ whose terms are real numbers is said to diverge to $+\infty$ if,
for every $M > 0$, there is an integer $N$ (depending on $M$) such that

$$a_n > M \quad\text{whenever } n \ge N.$$

In this case we write $\lim_{n \to \infty} a_n = +\infty$.

If $\lim_{n \to \infty} (-a_n) = +\infty$, we write $\lim_{n \to \infty} a_n = -\infty$ and say that $\{a_n\}$ diverges
to $-\infty$. Of course, there are divergent real-valued sequences which do not diverge
to $+\infty$ or to $-\infty$. For example, the sequence $\{(-1)^n (1 + 1/n)\}$ diverges but does
not diverge to $+\infty$ or to $-\infty$.

8.3 LIMIT SUPERIOR AND LIMIT INFERIOR OF A REAL-VALUED SEQUENCE 

Definition 8.2. Let $\{a_n\}$ be a sequence of real numbers. Suppose there is a real
number $U$ satisfying the following two conditions:

i) For every $\varepsilon > 0$ there exists an integer $N$ such that $n > N$ implies

$$a_n < U + \varepsilon.$$

ii) Given $\varepsilon > 0$ and given $m > 0$, there exists an integer $n > m$ such that

$$a_n > U - \varepsilon.$$

Then $U$ is called the limit superior (or upper limit) of $\{a_n\}$ and we write

$$U = \limsup_{n \to \infty} a_n.$$

Statement (i) implies that the set $\{a_1, a_2, \dots\}$ is bounded above. If this set is not
bounded above, we define

$$\limsup_{n \to \infty} a_n = +\infty.$$

If the set is bounded above but not bounded below and if $\{a_n\}$ has no finite limit
superior, then we say $\limsup_{n \to \infty} a_n = -\infty$. The limit inferior (or lower limit) of
$\{a_n\}$ is defined as follows:

$$\liminf_{n \to \infty} a_n = -\limsup_{n \to \infty} b_n, \quad\text{where } b_n = -a_n \text{ for } n = 1, 2, \dots$$

NOTE. Statement (i) means that ultimately all terms of the sequence lie to the left
of $U + \varepsilon$. Statement (ii) means that infinitely many terms lie to the right of $U - \varepsilon$.
It is clear that there cannot be more than one $U$ which satisfies both (i) and (ii).
Every real sequence has a limit superior and a limit inferior in the extended real
number system $\mathbf{R}^*$. (See Exercise 8.1.)

The reader should supply the proofs of the following theorems: 

Theorem 8.3. Let $\{a_n\}$ be a sequence of real numbers. Then we have:

a) $\liminf_{n\to\infty} a_n \le \limsup_{n\to\infty} a_n$.

b) The sequence converges if, and only if, $\limsup_{n\to\infty} a_n$ and $\liminf_{n\to\infty} a_n$ are both
finite and equal, in which case $\lim_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n$.

c) The sequence diverges to $+\infty$ if, and only if, $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = +\infty$.

d) The sequence diverges to $-\infty$ if, and only if, $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = -\infty$.

NOTE. A sequence for which $\liminf_{n\to\infty} a_n \ne \limsup_{n\to\infty} a_n$ is said to oscillate.

Theorem 8.4. Assume that $a_n \le b_n$ for each $n = 1, 2, \dots$ Then we have:

$$\liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} b_n \quad \text{and} \quad \limsup_{n\to\infty} a_n \le \limsup_{n\to\infty} b_n.$$


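Examples

1. $a_n = (-1)^n (1 + 1/n)$, $\liminf_{n\to\infty} a_n = -1$, $\limsup_{n\to\infty} a_n = +1$.

2. $a_n = (-1)^n$, $\liminf_{n\to\infty} a_n = -1$, $\limsup_{n\to\infty} a_n = +1$.

3. $a_n = (-1)^n n$, $\liminf_{n\to\infty} a_n = -\infty$, $\limsup_{n\to\infty} a_n = +\infty$.

4. $a_n = n^2 \sin^2(\tfrac{1}{2} n\pi)$, $\liminf_{n\to\infty} a_n = 0$, $\limsup_{n\to\infty} a_n = +\infty$.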
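
Examples 1 and 3 can be explored numerically. The following Python sketch is only an illustration and is not part of the text: the helper name `tail_sup_inf` and the window $1000 \le n \le 2000$ are assumptions chosen for the demonstration. It computes the largest and smallest values on a finite tail of the sequence, which suggest (but do not prove) the values of the limit superior and limit inferior.

```python
def tail_sup_inf(a, start, stop):
    """Return (sup, inf) of the finite tail a(start), ..., a(stop)."""
    values = [a(n) for n in range(start, stop + 1)]
    return max(values), min(values)

def example_1(n):
    return (-1) ** n * (1 + 1 / n)   # lim sup = +1, lim inf = -1

def example_3(n):
    return (-1) ** n * n             # lim sup = +oo, lim inf = -oo

sup1, inf1 = tail_sup_inf(example_1, 1000, 2000)
sup3, inf3 = tail_sup_inf(example_3, 1000, 2000)
print(sup1, inf1)   # slightly above +1 and slightly below -1
print(sup3, inf3)   # keeps growing in both directions as the window moves out
```

A finite window can never settle the question by itself; it only makes the behavior described in conditions (i) and (ii) of Definition 8.2 visible.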


8.4 MONOTONIC SEQUENCES OF REAL NUMBERS 

Definition 8.5. Let $\{a_n\}$ be a sequence of real numbers. We say the sequence is
increasing and we write $a_n \nearrow$ if $a_n \le a_{n+1}$ for $n = 1, 2, \dots$ If $a_n \ge a_{n+1}$ for all $n$,
we say the sequence is decreasing and we write $a_n \searrow$. A sequence is called monotonic
if it is increasing or if it is decreasing.

The convergence or divergence of a monotonic sequence is particularly easy
to determine. In fact, we have

Theorem 8.6. A monotonic sequence converges if, and only if, it is bounded.

Proof. If $a_n \nearrow$, $\lim_{n\to\infty} a_n = \sup \{a_n : n = 1, 2, \dots\}$. If $a_n \searrow$, $\lim_{n\to\infty} a_n =
\inf \{a_n : n = 1, 2, \dots\}$.


8.5 INFINITE SERIES 

Let $\{a_n\}$ be a given sequence of real or complex numbers, and form a new sequence
$\{s_n\}$ as follows:

$$s_n = a_1 + \cdots + a_n = \sum_{k=1}^{n} a_k \qquad (n = 1, 2, \dots). \tag{1}$$

Definition 8.7. The ordered pair of sequences $(\{a_n\}, \{s_n\})$ is called an infinite series.
The number $s_n$ is called the nth partial sum of the series. The series is said to converge
or to diverge according as $\{s_n\}$ is convergent or divergent. The following
symbols are used to denote the series defined by (1):

$$a_1 + a_2 + \cdots + a_n + \cdots, \qquad a_1 + a_2 + a_3 + \cdots, \qquad \sum_{k=1}^{\infty} a_k.$$

NOTE. The letter $k$ used in $\sum_{k=1}^{\infty} a_k$ is a “dummy variable” and may be replaced
by any other convenient symbol. If $p$ is an integer $\ge 0$, a symbol of the form
$\sum_{n=p}^{\infty} b_n$ is interpreted to mean $\sum_{n=1}^{\infty} a_n$ where $a_n = b_{n+p-1}$. When there is no
danger of misunderstanding, we write $\sum b_n$ instead of $\sum_{n=p}^{\infty} b_n$.

If the sequence $\{s_n\}$ defined by (1) converges to $s$, the number $s$ is called the sum
of the series and we write

$$s = \sum_{k=1}^{\infty} a_k.$$

Thus, for convergent series the symbol $\sum a_k$ is used to denote both the series and
its sum.

Example. If $x$ has the infinite decimal expansion $x = a_0.a_1a_2\cdots$ (see Section 1.17), then
the series $\sum_{k=0}^{\infty} a_k 10^{-k}$ converges to $x$.

Theorem 8.8. Let $a = \sum a_n$ and $b = \sum b_n$ be convergent series. Then, for every
pair of constants $\alpha$ and $\beta$, the series $\sum (\alpha a_n + \beta b_n)$ converges to the sum $\alpha a + \beta b$.
That is,

$$\sum_{n=1}^{\infty} (\alpha a_n + \beta b_n) = \alpha \sum_{n=1}^{\infty} a_n + \beta \sum_{n=1}^{\infty} b_n.$$

Proof. $\sum_{k=1}^{n} (\alpha a_k + \beta b_k) = \alpha \sum_{k=1}^{n} a_k + \beta \sum_{k=1}^{n} b_k$.

Theorem 8.9. Assume that $a_n \ge 0$ for each $n = 1, 2, \dots$ Then $\sum a_n$ converges if,
and only if, the sequence of partial sums is bounded above.

Proof. Let $s_n = a_1 + \cdots + a_n$. Then $s_n \nearrow$ and we can apply Theorem 8.6.

Theorem 8.10 (Telescoping series). Let $\{a_n\}$ and $\{b_n\}$ be two sequences such that
$a_n = b_{n+1} - b_n$ for $n = 1, 2, \dots$ Then $\sum a_n$ converges if, and only if, $\lim_{n\to\infty} b_n$
exists, in which case we have

$$\sum_{n=1}^{\infty} a_n = \lim_{n\to\infty} b_n - b_1.$$

Proof. $\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} (b_{k+1} - b_k) = b_{n+1} - b_1$.
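
As a concrete instance of Theorem 8.10, one may take $b_n = -1/n$, so that $a_n = b_{n+1} - b_n = 1/(n(n+1))$. The Python sketch below is an illustration only (the choice of $b_n$ and the function names are assumptions made for the example); it checks with exact rational arithmetic that the nth partial sum equals $b_{n+1} - b_1$.

```python
from fractions import Fraction

def b(n):
    # b_n = -1/n, so that a_n = b_{n+1} - b_n = 1/(n(n+1)).
    return Fraction(-1, n)

def a(n):
    return Fraction(1, n * (n + 1))

def partial_sum(n):
    return sum(a(k) for k in range(1, n + 1))

n = 50
telescoped = b(n + 1) - b(1)      # b_{n+1} - b_1 = 1 - 1/(n+1)
print(partial_sum(n), telescoped)
```

Since $b_n \to 0$, the theorem gives the sum $0 - b_1 = 1$ for this series, and the partial sums $1 - 1/(n+1)$ approach that value.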


Theorem 8.11 (Cauchy condition for series). The series $\sum a_n$ converges if, and only
if, for every $\varepsilon > 0$ there exists an integer $N$ such that $n > N$ implies

$$|a_{n+1} + \cdots + a_{n+p}| < \varepsilon \quad \text{for each } p = 1, 2, \dots \tag{2}$$

Proof. Let $s_n = \sum_{k=1}^{n} a_k$, write $s_{n+p} - s_n = a_{n+1} + \cdots + a_{n+p}$, and apply
Theorem 4.8 and Theorem 4.6.

Taking $p = 1$ in (2), we find that $\lim_{n\to\infty} a_n = 0$ is a necessary condition for
convergence of $\sum a_n$. That this condition is not sufficient is seen by considering the
example in which $a_n = 1/n$. When $n = 2^m$ and $p = 2^m$ in (2), we find

$$a_{n+1} + \cdots + a_{n+p} = \frac{1}{2^m + 1} + \cdots + \frac{1}{2^m + 2^m} \ge \frac{2^m}{2^m + 2^m} = \frac{1}{2},$$

and hence the Cauchy condition cannot be satisfied when $\varepsilon \le \tfrac{1}{2}$. Therefore the
series $\sum_{n=1}^{\infty} 1/n$ diverges. This series is called the harmonic series.
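
The block estimate used for the harmonic series can be checked directly. The short Python sketch below is an illustration, with the function name `block_sum` and the range of $m$ chosen as assumptions for the demo; it sums $1/k$ over $k = 2^m + 1, \dots, 2^m + 2^m$ in exact arithmetic.

```python
from fractions import Fraction

def block_sum(m):
    """Sum of 1/k for k = 2^m + 1, ..., 2^m + 2^m (exact arithmetic)."""
    n = 2 ** m
    return sum(Fraction(1, k) for k in range(n + 1, 2 * n + 1))

blocks = [block_sum(m) for m in range(0, 8)]
for m, value in enumerate(blocks):
    print(m, float(value))   # every block contributes at least 1/2
```

Each block contains $2^m$ terms, each at least $1/2^{m+1}$, which is exactly why no block sum can drop below $\tfrac{1}{2}$.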
8.6 INSERTING AND REMOVING PARENTHESES

Definition 8.12. Let $p$ be a function whose domain is the set of positive integers and
whose range is a subset of the positive integers such that

i) $p(n) < p(m)$, if $n < m$.

Let $\sum a_n$ and $\sum b_n$ be two series related as follows:

ii) $b_1 = a_1 + a_2 + \cdots + a_{p(1)}$,
    $b_{n+1} = a_{p(n)+1} + a_{p(n)+2} + \cdots + a_{p(n+1)}$, if $n = 1, 2, \dots$

Then we say that $\sum b_n$ is obtained from $\sum a_n$ by inserting parentheses, and that $\sum a_n$ is
obtained from $\sum b_n$ by removing parentheses.

Theorem 8.13. If $\sum a_n$ converges to $s$, every series $\sum b_n$ obtained from $\sum a_n$ by
inserting parentheses also converges to $s$.

Proof. Let $\sum a_n$ and $\sum b_n$ be related by (ii) and write $s_n = \sum_{k=1}^{n} a_k$, $t_n = \sum_{k=1}^{n} b_k$.
Then $\{t_n\}$ is a subsequence of $\{s_n\}$. In fact, $t_n = s_{p(n)}$. Therefore, convergence of
$\{s_n\}$ to $s$ implies convergence of $\{t_n\}$ to $s$.

Removing parentheses may destroy convergence. To see this, consider the
series $\sum b_n$ in which each term is 0 (obviously convergent). Let $p(n) = 2n$ and let
$a_n = (-1)^n$. Then (i) and (ii) hold but $\sum a_n$ diverges.
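
The counterexample just described is small enough to compute. In the Python sketch below (an illustration; the function names mirror the symbols $a_n$, $p(n)$, $b_n$ of Definition 8.12 and the cut-offs are arbitrary), the grouped terms are all 0 while the ungrouped partial sums keep oscillating.

```python
def a(n):
    return (-1) ** n          # a_n = (-1)^n

def p(n):
    return 2 * n              # p(n) = 2n groups the terms in pairs

def b(n):
    # b_1 = a_1 + ... + a_{p(1)};  b_n = a_{p(n-1)+1} + ... + a_{p(n)} for n >= 2.
    lo = 1 if n == 1 else p(n - 1) + 1
    return sum(a(k) for k in range(lo, p(n) + 1))

grouped_terms = [b(n) for n in range(1, 6)]
ungrouped_partial_sums = []
s = 0
for k in range(1, 11):
    s += a(k)
    ungrouped_partial_sums.append(s)
print(grouped_terms)            # every grouped term is 0
print(ungrouped_partial_sums)   # alternates between -1 and 0
```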

Parentheses can be removed if we further restrict $\sum a_n$ and $p$.

Theorem 8.14. Let $\sum a_n$, $\sum b_n$ be related as in Definition 8.12. Assume that there
exists a constant $M > 0$ such that $p(n + 1) - p(n) < M$ for all $n$, and assume that
$\lim_{n\to\infty} a_n = 0$. Then $\sum a_n$ converges if, and only if, $\sum b_n$ converges, in which case
they have the same sum.

Proof. If $\sum a_n$ converges, the result follows from Theorem 8.13. The whole
difficulty lies in the converse deduction. Let

$$s_n = a_1 + \cdots + a_n, \qquad t_n = b_1 + \cdots + b_n, \qquad t = \lim_{n\to\infty} t_n.$$

Let $\varepsilon > 0$ be given and choose $N$ so that $n > N$ implies

$$|t_n - t| < \frac{\varepsilon}{2} \quad \text{and} \quad |a_n| < \frac{\varepsilon}{2M}.$$

If $n > p(N)$, we can find $m \ge N$ so that $N \le p(m) \le n < p(m + 1)$. [Why?]
For such $n$, we have

$$s_n = a_1 + \cdots + a_{p(m+1)} - (a_{n+1} + a_{n+2} + \cdots + a_{p(m+1)})
     = t_{m+1} - (a_{n+1} + a_{n+2} + \cdots + a_{p(m+1)}),$$

and hence

$$|s_n - t| \le |t_{m+1} - t| + |a_{n+1} + a_{n+2} + \cdots + a_{p(m+1)}|$$
$$\le |t_{m+1} - t| + |a_{p(m)+1}| + |a_{p(m)+2}| + \cdots + |a_{p(m+1)}|$$
$$< \frac{\varepsilon}{2} + \bigl(p(m + 1) - p(m)\bigr) \frac{\varepsilon}{2M} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This proves that $\lim_{n\to\infty} s_n = t$.

8.7 ALTERNATING SERIES 

Definition 8.15. If $a_n > 0$ for each $n$, the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ is called an
alternating series.

Theorem 8.16. If $\{a_n\}$ is a decreasing sequence converging to 0, the alternating
series $\sum (-1)^{n+1} a_n$ converges. If $s$ denotes its sum and $s_n$ its nth partial sum, we have
the inequality

$$0 < (-1)^n (s - s_n) < a_{n+1}, \quad \text{for } n = 1, 2, \dots \tag{3}$$

NOTE. Inequality (3) tells us that when we “approximate” $s$ by $s_n$, the error made
has the same sign as the first neglected term and is less than the absolute value of
this term.
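
Inequality (3) can be observed on the series $1 - \tfrac12 + \tfrac13 - \cdots$. The Python sketch below is an illustration under one stated assumption that is not taken from this section: it uses the standard fact that this series has sum $\log 2$. For $n = 1, \dots, 40$ it checks that the signed error $(-1)^n (s - s_n)$ lies strictly between 0 and $a_{n+1} = 1/(n+1)$.

```python
import math

s = math.log(2)                       # assumed known sum of 1 - 1/2 + 1/3 - ...
partial = 0.0
checks = []
for n in range(1, 41):
    partial += (-1) ** (n + 1) / n    # s_n, the nth partial sum
    signed_error = (-1) ** n * (s - partial)
    checks.append(0 < signed_error < 1 / (n + 1))
print(all(checks))
```

The margins here are of order $1/n$, far larger than floating-point rounding, so the check is meaningful for the modest values of $n$ used.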

Proof. We insert parentheses in $\sum (-1)^{n+1} a_n$, grouping together two terms at a
time. That is, we take $p(n) = 2n$ and form a new series $\sum b_n$ according to Definition
8.12, with

$$b_1 = a_1 - a_2, \quad b_2 = a_3 - a_4, \quad \dots, \quad b_n = a_{2n-1} - a_{2n}.$$

Since $a_n \to 0$ and $p(n + 1) - p(n) = 2$, Theorem 8.14 tells us that $\sum (-1)^{n+1} a_n$
converges if $\sum b_n$ converges. But $\sum b_n$ is a series of nonnegative terms (since $a_n \searrow$),
and its partial sums are bounded above, since

$$\sum_{k=1}^{n} b_k = a_1 - (a_2 - a_3) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n} < a_1.$$

Therefore $\sum b_n$ converges, so $\sum (-1)^{n+1} a_n$ also converges.

Inequality (3) is a consequence of the following relations:

$$(-1)^n (s - s_n) = \sum_{k=1}^{\infty} (-1)^{k+1} a_{n+k} = \sum_{k=1}^{\infty} (a_{n+2k-1} - a_{n+2k}) > 0,$$

and

$$(-1)^n (s - s_n) = a_{n+1} - \sum_{k=1}^{\infty} (a_{n+2k} - a_{n+2k+1}) < a_{n+1}.$$
8.8 ABSOLUTE AND CONDITIONAL CONVERGENCE

Definition 8.17. A series $\sum a_n$ is called absolutely convergent if $\sum |a_n|$ converges. It
is called conditionally convergent if $\sum a_n$ converges but $\sum |a_n|$ diverges.

Theorem 8.18. Absolute convergence of $\sum a_n$ implies convergence.

Proof. Apply the Cauchy condition to the inequality

$$|a_{n+1} + \cdots + a_{n+p}| \le |a_{n+1}| + \cdots + |a_{n+p}|.$$

To see that the converse is not true, consider the example

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}.$$

This alternating series converges, by Theorem 8.16, but it does not converge
absolutely.

Theorem 8.19. Let $\sum a_n$ be a given series with real-valued terms and define

$$p_n = \frac{|a_n| + a_n}{2}, \qquad q_n = \frac{|a_n| - a_n}{2} \qquad (n = 1, 2, \dots). \tag{4}$$

Then:

i) If $\sum a_n$ is conditionally convergent, both $\sum p_n$ and $\sum q_n$ diverge.

ii) If $\sum |a_n|$ converges, both $\sum p_n$ and $\sum q_n$ converge and we have

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} p_n - \sum_{n=1}^{\infty} q_n.$$

NOTE. $p_n = a_n$ and $q_n = 0$ if $a_n \ge 0$, whereas $q_n = -a_n$ and $p_n = 0$ if $a_n \le 0$.

Proof. We have $a_n = p_n - q_n$, $|a_n| = p_n + q_n$. To prove (i), assume that $\sum a_n$
converges and $\sum |a_n|$ diverges. If $\sum q_n$ converges, then $\sum p_n$ also converges (by
Theorem 8.8), since $p_n = a_n + q_n$. Similarly, if $\sum p_n$ converges, then $\sum q_n$ also
converges. Hence, if either $\sum p_n$ or $\sum q_n$ converges, both must converge and we
deduce that $\sum |a_n|$ converges, since $|a_n| = p_n + q_n$. This contradiction proves (i).
To prove (ii), we simply use (4) in conjunction with Theorem 8.8.
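
The decomposition (4) is easy to tabulate. The Python sketch below is an illustration only; the function names `pos_part` and `neg_part` are assumptions, and the sample terms are the first few terms of the alternating harmonic series. It confirms the two identities $a_n = p_n - q_n$ and $|a_n| = p_n + q_n$ used in the proof.

```python
from fractions import Fraction

def pos_part(a):
    return (abs(a) + a) / 2   # p_n in (4)

def neg_part(a):
    return (abs(a) - a) / 2   # q_n in (4)

terms = [Fraction((-1) ** (n + 1), n) for n in range(1, 9)]  # 1, -1/2, 1/3, ...
p = [pos_part(a) for a in terms]
q = [neg_part(a) for a in terms]
print(p[:4], q[:4])
```

For this conditionally convergent series the lists `p` and `q` contain, respectively, the terms $1, 0, \tfrac13, 0, \dots$ and $0, \tfrac12, 0, \tfrac14, \dots$, whose sums both diverge, in agreement with part (i).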


8.9 REAL AND IMAGINARY PARTS OF A COMPLEX SERIES 

Let $\sum c_n$ be a series with complex terms and write $c_n = a_n + i b_n$, where $a_n$ and $b_n$
are real. The series $\sum a_n$ and $\sum b_n$ are called, respectively, the real and imaginary
parts of $\sum c_n$. In situations involving complex series, it is often convenient to treat
the real and imaginary parts separately. Of course, convergence of both $\sum a_n$ and
$\sum b_n$ implies convergence of $\sum c_n$. Conversely, convergence of $\sum c_n$ implies
convergence of both $\sum a_n$ and $\sum b_n$. The same remarks hold for absolute convergence.
However, when $\sum c_n$ is conditionally convergent, one (but not both) of $\sum a_n$ and
$\sum b_n$ might be absolutely convergent. (See Exercise 8.19.)

If $\sum c_n$ converges absolutely, we can apply part (ii) of Theorem 8.19 to the real
and imaginary parts separately, to obtain the decomposition

$$\sum c_n = \sum (p_n + i u_n) - \sum (q_n + i v_n),$$

where $\sum p_n$, $\sum q_n$, $\sum u_n$, $\sum v_n$ are convergent series of nonnegative terms.


8.10 TESTS FOR CONVERGENCE OF SERIES WITH POSITIVE TERMS 

Theorem 8.20 (Comparison test). If $a_n > 0$ and $b_n > 0$ for $n = 1, 2, \dots$, and
if there exist positive constants $c$ and $N$ such that

$$a_n < c\, b_n \quad \text{for } n > N,$$

then convergence of $\sum b_n$ implies convergence of $\sum a_n$.

Proof. The partial sums of $\sum a_n$ are bounded if the partial sums of $\sum b_n$ are bounded.
By Theorem 8.9, this completes the proof.

Theorem 8.21 (Limit comparison test). Assume that $a_n > 0$ and $b_n > 0$ for
$n = 1, 2, \dots$, and suppose that

$$\lim_{n\to\infty} \frac{a_n}{b_n} = 1.$$

Then $\sum a_n$ converges if, and only if, $\sum b_n$ converges.

Proof. There exists $N$ such that $n > N$ implies $\tfrac12 < a_n/b_n < \tfrac32$. The theorem
follows by applying Theorem 8.20 twice.

NOTE. Theorem 8.21 also holds if $\lim_{n\to\infty} a_n/b_n = c$, provided that $c \ne 0$. If
$\lim_{n\to\infty} a_n/b_n = 0$, we can only conclude that convergence of $\sum b_n$ implies
convergence of $\sum a_n$.

8.11 THE GEOMETRIC SERIES 

To use comparison tests effectively, we must have at our disposal some examples of 
series of known behavior. One of the most important series for comparison 
purposes is the geometric series. 

Theorem 8.22. If $|x| < 1$, the series $1 + x + x^2 + \cdots$ converges and has sum
$1/(1 - x)$. If $|x| \ge 1$, the series diverges.

Proof. $(1 - x) \sum_{k=0}^{n} x^k = \sum_{k=0}^{n} (x^k - x^{k+1}) = 1 - x^{n+1}$. When $|x| < 1$, we
find $\lim_{n\to\infty} x^{n+1} = 0$. If $|x| \ge 1$, the general term does not tend to zero and the
series cannot converge.
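
The telescoping identity in this proof can be verified exactly for a rational $x$. The Python sketch below is an illustration (the value $x = 1/3$, the index $n = 20$, and the function name are assumptions for the demo); it checks $(1 - x)\sum_{k=0}^{n} x^k = 1 - x^{n+1}$ and reports the distance between the partial sum and $1/(1 - x)$.

```python
from fractions import Fraction

def geometric_partial_sum(x, n):
    """Return 1 + x + ... + x^n."""
    return sum(x ** k for k in range(n + 1))

x = Fraction(1, 3)
n = 20
lhs = (1 - x) * geometric_partial_sum(x, n)
rhs = 1 - x ** (n + 1)
gap = 1 / (1 - x) - geometric_partial_sum(x, n)   # distance to the limit 1/(1 - x)
print(lhs == rhs, float(gap))
```

The gap equals $x^{n+1}/(1 - x)$, which makes the geometric rate of convergence explicit.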
8.12 THE INTEGRAL TEST

Further examples of series of known behavior can be obtained very simply by 
applying the integral test. 


Theorem 8.23 (Integral test). Let $f$ be a positive decreasing function defined on
$[1, +\infty)$ such that $\lim_{x\to+\infty} f(x) = 0$. For $n = 1, 2, \dots$, define

$$s_n = \sum_{k=1}^{n} f(k), \qquad t_n = \int_1^n f(x)\,dx, \qquad d_n = s_n - t_n.$$

Then we have:

i) $0 < f(n + 1) \le d_{n+1} \le d_n \le f(1)$, for $n = 1, 2, \dots$

ii) $\lim_{n\to\infty} d_n$ exists.

iii) $\sum_{n=1}^{\infty} f(n)$ converges if, and only if, the sequence $\{t_n\}$ converges.

iv) $0 \le d_k - \lim_{n\to\infty} d_n \le f(k)$, for $k = 1, 2, \dots$

Proof. To prove (i), write

$$t_{n+1} = \int_1^{n+1} f(x)\,dx = \sum_{k=1}^{n} \int_k^{k+1} f(x)\,dx \le \sum_{k=1}^{n} \int_k^{k+1} f(k)\,dx = \sum_{k=1}^{n} f(k) = s_n.$$

This implies that $f(n + 1) = s_{n+1} - s_n \le s_{n+1} - t_{n+1} = d_{n+1}$, and we obtain

$$0 < f(n + 1) \le d_{n+1}.$$

But we also have

$$d_n - d_{n+1} = t_{n+1} - t_n - (s_{n+1} - s_n) = \int_n^{n+1} f(x)\,dx - f(n + 1)
\ge \int_n^{n+1} f(n + 1)\,dx - f(n + 1) = 0, \tag{5}$$

and hence $d_{n+1} \le d_n \le d_1 = f(1)$. This proves (i). But now it is clear that (i)
implies (ii) and that (ii) implies (iii).

To prove part (iv), we use (5) again to write

$$0 \le d_n - d_{n+1} \le \int_n^{n+1} f(n)\,dx - f(n + 1) = f(n) - f(n + 1).$$

Summing on $n$, we get

$$0 \le \sum_{n=k}^{\infty} (d_n - d_{n+1}) \le \sum_{n=k}^{\infty} \bigl(f(n) - f(n + 1)\bigr), \quad \text{if } k \ge 1.$$

When we evaluate the sums of these telescoping series, we get (iv).

NOTE. Let $D = \lim_{n\to\infty} d_n$. Then (i) implies $0 \le D \le f(1)$, whereas (iv) gives us

$$0 \le \sum_{k=1}^{n} f(k) - \int_1^n f(x)\,dx - D \le f(n). \tag{6}$$

This inequality is extremely useful for approximating certain finite sums by
integrals.

8.13 THE BIG OH AND LITTLE OH NOTATION 


Definition 8.24. Given two sequences $\{a_n\}$ and $\{b_n\}$ such that $b_n > 0$ for all $n$. We
write

$$a_n = O(b_n) \quad \text{(read: “$a_n$ is big oh of $b_n$”),}$$

if there exists a constant $M > 0$ such that $|a_n| \le M b_n$ for all $n$. We write

$$a_n = o(b_n) \text{ as } n \to \infty \quad \text{(read: “$a_n$ is little oh of $b_n$”),}$$

if $\lim_{n\to\infty} a_n/b_n = 0$.

NOTE. An equation of the form $a_n = c_n + O(b_n)$ means $a_n - c_n = O(b_n)$. Similarly,
$a_n = c_n + o(b_n)$ means $a_n - c_n = o(b_n)$. The advantage of this notation
is that it allows us to replace certain inequalities by equations. For example, (6)
implies

$$\sum_{k=1}^{n} f(k) = \int_1^n f(x)\,dx + D + O(f(n)). \tag{7}$$

Example 1. Let $f(x) = 1/x$ in Theorem 8.23. We find $t_n = \log n$ and hence $\sum 1/n$
diverges. However, (ii) establishes the existence of the limit

$$\lim_{n\to\infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \log n \right),$$

a famous number known as Euler's constant, usually denoted by $C$ (or by $\gamma$). Equation (7)
becomes

$$\sum_{k=1}^{n} \frac{1}{k} = \log n + C + O\!\left(\frac{1}{n}\right). \tag{8}$$
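
Parts (i) and (iv) of Theorem 8.23 can be watched at work in Example 1. The Python sketch below is an illustration under stated assumptions: the decimal value of Euler's constant is supplied as a known numerical constant (it is not derived here), and the name `d` and the range $1 \le n \le 200$ are choices made for the demo. It checks that $d_n = \sum_{k=1}^{n} 1/k - \log n$ decreases and that $0 \le d_k - C \le 1/k$.

```python
import math

EULER_GAMMA = 0.5772156649015329   # assumed known decimal value of Euler's constant

def d(n):
    """d_n = s_n - t_n for f(x) = 1/x, that is, H_n - log n."""
    harmonic = math.fsum(1 / k for k in range(1, n + 1))
    return harmonic - math.log(n)

values = [d(n) for n in range(1, 201)]
decreasing = all(values[i + 1] <= values[i] for i in range(len(values) - 1))
within_bound = all(0 <= values[k - 1] - EULER_GAMMA <= 1 / k for k in range(1, 201))
print(decreasing, within_bound, values[-1])
```

The differences $d_n - d_{n+1}$ are of order $1/n^2$, comfortably above rounding error in this range, so the monotonicity check is not an artifact of floating-point arithmetic.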


Example 2. Let $f(x) = x^{-s}$, $s \ne 1$, in Theorem 8.23. We find that $\sum n^{-s}$ converges if
$s > 1$ and diverges if $s < 1$. For $s > 1$, this series defines an important function known
as the Riemann zeta function:

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \qquad (s > 1).$$

For $s > 0$, $s \ne 1$, we can apply (7) to write

$$\sum_{k=1}^{n} \frac{1}{k^s} = \frac{n^{1-s} - 1}{1 - s} + C(s) + O\!\left(\frac{1}{n^s}\right),$$

where $C(s) = \lim_{n\to\infty} \left( \sum_{k=1}^{n} k^{-s} - (n^{1-s} - 1)/(1 - s) \right)$.
8.14 THE RATIO TEST AND ROOT TEST


Theorem 8.25 (Ratio test). Given a series $\sum a_n$ of nonzero complex terms, let

$$r = \liminf_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right|, \qquad R = \limsup_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right|.$$

a) The series $\sum a_n$ converges absolutely if $R < 1$.

b) The series $\sum a_n$ diverges if $r > 1$.

c) The test is inconclusive if $r \le 1 \le R$.

Proof. Assume that $R < 1$ and choose $x$ so that $R < x < 1$. The definition of $R$
implies the existence of $N$ such that $|a_{n+1}/a_n| < x$ if $n \ge N$. Since $x = x^{n+1}/x^n$,
this means that

$$\frac{|a_{n+1}|}{x^{n+1}} < \frac{|a_n|}{x^n} \le \frac{|a_N|}{x^N}, \quad \text{if } n \ge N,$$

and hence $|a_n| \le c x^n$ if $n \ge N$, where $c = |a_N| x^{-N}$. Statement (a) now follows by
applying the comparison test.

To prove (b), we simply observe that $r > 1$ implies $|a_{n+1}| > |a_n|$ for all $n \ge N$
for some $N$ and hence we cannot have $\lim_{n\to\infty} a_n = 0$.

To prove (c), consider the two examples $\sum n^{-1}$ and $\sum n^{-2}$. In both cases,
$r = R = 1$ but $\sum n^{-1}$ diverges, whereas $\sum n^{-2}$ converges.

Theorem 8.26 (Root test). Given a series $\sum a_n$ of complex terms, let

$$\rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|}.$$

a) The series $\sum a_n$ converges absolutely if $\rho < 1$.

b) The series $\sum a_n$ diverges if $\rho > 1$.

c) The test is inconclusive if $\rho = 1$.

Proof. Assume that $\rho < 1$ and choose $x$ so that $\rho < x < 1$. The definition of $\rho$
implies the existence of $N$ such that $|a_n| < x^n$ for $n \ge N$. Hence, $\sum |a_n|$ converges
by the comparison test. This proves (a).

To prove (b), we observe that $\rho > 1$ implies $|a_n| > 1$ infinitely often and
hence we cannot have $\lim_{n\to\infty} a_n = 0$.

Finally, (c) is proved by using the same examples as in Theorem 8.25.

NOTE. The root test is more “powerful” than the ratio test. That is, whenever the
root test is inconclusive, so is the ratio test. But there are examples where the ratio
test fails and the root test is conclusive. (See Exercise 8.4.)
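
To make the last remark concrete, here is a Python sketch built on a series chosen purely for illustration (it is an assumption of this example and is not claimed to be the series of Exercise 8.4): $a_n = 3^{-n}$ for odd $n$ and $a_n = 2^{-n}$ for even $n$. The ratios $a_{n+1}/a_n$ swing between very small and very large values, so the ratio test says nothing, while the nth roots stay at $\tfrac13$ or $\tfrac12$.

```python
from fractions import Fraction

def a(n):
    # Illustrative terms: 3^(-n) for odd n, 2^(-n) for even n.
    base = 2 if n % 2 == 0 else 3
    return Fraction(1, base ** n)

ratios = [a(n + 1) / a(n) for n in range(1, 40)]
roots = [float(a(n)) ** (1.0 / n) for n in range(1, 40)]
print(float(min(ratios)), float(max(ratios)))  # ratios range from tiny to huge
print(max(roots))                              # roots never exceed about 1/2
```

On this evidence $r \le 1 \le R$ while $\rho = \tfrac12 < 1$, so only the root test detects the (absolute) convergence.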


8.15 DIRICHLET’S TEST AND ABEL’S TEST. 

All the tests in the previous section help us to determine absolute convergence of a
series with complex terms. It is also important to have tests for determining
convergence when the series might not converge absolutely. The tests in this
section are particularly useful for this purpose. They all depend on the partial
summation formula of Abel (equation (9) in the next theorem).

Theorem 8.27. If $\{a_n\}$ and $\{b_n\}$ are two sequences of complex numbers, define

$$A_n = a_1 + \cdots + a_n.$$

Then we have the identity

$$\sum_{k=1}^{n} a_k b_k = A_n b_{n+1} - \sum_{k=1}^{n} A_k (b_{k+1} - b_k). \tag{9}$$

Therefore, $\sum_{k=1}^{\infty} a_k b_k$ converges if both the series $\sum_{k=1}^{\infty} A_k (b_{k+1} - b_k)$ and the
sequence $\{A_n b_{n+1}\}$ converge.

Proof. Writing $A_0 = 0$, we have

$$\sum_{k=1}^{n} a_k b_k = \sum_{k=1}^{n} (A_k - A_{k-1}) b_k = \sum_{k=1}^{n} A_k b_k - \sum_{k=1}^{n} A_k b_{k+1} + A_n b_{n+1}.$$

The second assertion follows at once from this identity.

NOTE. Formula (9) is analogous to the formula for integration by parts in a
Riemann-Stieltjes integral.
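
Abel's partial summation formula (9) is a finite identity, so it can be confirmed exactly. The Python sketch below is an illustration only; the sample sequences $a_k = (-1)^k$ and $b_k = 1/k$ and the helper names are assumptions chosen for the demo.

```python
from fractions import Fraction

def partial_sum_A(a, m):
    """A_m = a(1) + ... + a(m), with A_0 = 0."""
    return sum(a(k) for k in range(1, m + 1))

def abel_sides(a, b, n):
    """Return the left and right sides of identity (9)."""
    lhs = sum(a(k) * b(k) for k in range(1, n + 1))
    correction = sum(partial_sum_A(a, k) * (b(k + 1) - b(k)) for k in range(1, n + 1))
    rhs = partial_sum_A(a, n) * b(n + 1) - correction
    return lhs, rhs

def a(k):
    return Fraction((-1) ** k)    # partial sums stay bounded

def b(k):
    return Fraction(1, k)         # decreases to 0

lhs, rhs = abel_sides(a, b, 25)
print(lhs == rhs)
```

These sample sequences have bounded partial sums $A_n$ and a decreasing null sequence $b_n$, so both quantities on the right of (9) are easy to control as $n$ grows.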

Theorem 8.28 (Dirichlet's test). Let $\sum a_n$ be a series of complex terms whose partial
sums form a bounded sequence. Let $\{b_n\}$ be a decreasing sequence which converges
to 0. Then $\sum a_n b_n$ converges.

Proof. Let $A_n = a_1 + \cdots + a_n$ and assume that $|A_n| \le M$ for all $n$. Then

$$\lim_{n\to\infty} A_n b_{n+1} = 0.$$

Therefore, to establish convergence of $\sum a_n b_n$ we need only show that $\sum A_k (b_{k+1} - b_k)$
is convergent. Since $b_n \searrow$, we have

$$|A_k (b_{k+1} - b_k)| \le M (b_k - b_{k+1}).$$

But the series $\sum (b_{k+1} - b_k)$ is a convergent telescoping series. Hence the
comparison test implies absolute convergence of $\sum A_k (b_{k+1} - b_k)$.

Theorem 8.29 (Abel's test). The series $\sum a_n b_n$ converges if $\sum a_n$ converges and if
$\{b_n\}$ is a monotonic convergent sequence.

Proof. Convergence of $\sum a_n$ and of $\{b_n\}$ establishes the existence of the limit
$\lim_{n\to\infty} A_n b_{n+1}$, where $A_n = a_1 + \cdots + a_n$. Also, $\{A_n\}$ is a bounded sequence.
The remainder of the proof is similar to that of Theorem 8.28. (Two further tests,
similar to the above, are given in Exercise 8.27.)
8.16 PARTIAL SUMS OF THE GEOMETRIC SERIES ON THE
UNIT CIRCLE |z| = 1 

To use Dirichlet’s test effectively, we must be acquainted with a few series having 
bounded partial sums. Of course, all convergent series have this property. The next 
theorem gives an example of a divergent series whose partial sums are bounded. 
This is the geometric series $\sum z^n$ with $|z| = 1$, that is, with $z = e^{ix}$ where $x$ is real.
The formula for the partial sums of this series is of fundamental importance in the 
theory of Fourier series. 


Theorem 8.30. For every real $x \ne 2m\pi$ ($m$ is an integer), we have

$$\sum_{k=1}^{n} e^{ikx} = e^{ix}\, \frac{1 - e^{inx}}{1 - e^{ix}} = \frac{\sin (nx/2)}{\sin (x/2)}\, e^{i(n+1)x/2}. \tag{10}$$

NOTE. This identity yields the following estimate:

$$\left| \sum_{k=1}^{n} e^{ikx} \right| \le \frac{1}{|\sin (x/2)|}. \tag{11}$$

Proof. $(1 - e^{ix}) \sum_{k=1}^{n} e^{ikx} = \sum_{k=1}^{n} (e^{ikx} - e^{i(k+1)x}) = e^{ix} - e^{i(n+1)x}$. This
establishes the first equality in (10). The second follows from the identity

$$e^{ix}\, \frac{1 - e^{inx}}{1 - e^{ix}} = \frac{e^{inx/2} - e^{-inx/2}}{e^{ix/2} - e^{-ix/2}}\, e^{i(n+1)x/2}.$$

NOTE. By considering the real and imaginary parts of (10), we obtain

$$\sum_{k=1}^{n} \cos kx = \sin \frac{nx}{2} \cos (n + 1)\frac{x}{2} \Big/ \sin \frac{x}{2}
= -\frac{1}{2} + \frac{1}{2} \sin (2n + 1)\frac{x}{2} \Big/ \sin \frac{x}{2}, \tag{12}$$

$$\sum_{k=1}^{n} \sin kx = \sin \frac{nx}{2} \sin (n + 1)\frac{x}{2} \Big/ \sin \frac{x}{2}. \tag{13}$$

Using (10), we can also write

$$\sum_{k=1}^{n} e^{i(2k-1)x} = e^{-ix} \sum_{k=1}^{n} e^{ik(2x)} = \frac{\sin nx}{\sin x}\, e^{inx}, \tag{14}$$

an identity valid for every $x \ne m\pi$ ($m$ is an integer). Taking real and imaginary
parts of (14) we obtain

$$\sum_{k=1}^{n} \cos (2k - 1)x = \frac{\sin 2nx}{2 \sin x}, \tag{15}$$

$$\sum_{k=1}^{n} \sin (2k - 1)x = \frac{\sin^2 nx}{\sin x}. \tag{16}$$

Formulas (12) and (16) occur in the theory of Fourier series.

8.17 REARRANGEMENTS OF SERIES 

We recall that $\mathbf{Z}^+$ denotes the set of positive integers, $\mathbf{Z}^+ = \{1, 2, 3, \dots\}$.

Definition 8.31. Let $f$ be a function whose domain is $\mathbf{Z}^+$ and whose range is $\mathbf{Z}^+$,
and assume that $f$ is one-to-one on $\mathbf{Z}^+$. Let $\sum a_n$ and $\sum b_n$ be two series such that

$$b_n = a_{f(n)} \quad \text{for } n = 1, 2, \dots \tag{17}$$

Then $\sum b_n$ is said to be a rearrangement of $\sum a_n$.

NOTE. Equation (17) implies $a_n = b_{f^{-1}(n)}$ and hence $\sum a_n$ is also a rearrangement
of $\sum b_n$.

Theorem 8.32. Let $\sum a_n$ be an absolutely convergent series having sum $s$. Then
every rearrangement of $\sum a_n$ also converges absolutely and has sum $s$.

Proof. Let $\{b_n\}$ be defined by (17). Then

$$|b_1| + \cdots + |b_n| = |a_{f(1)}| + \cdots + |a_{f(n)}| \le \sum_{k=1}^{\infty} |a_k|,$$

so $\sum |b_n|$ has bounded partial sums. Hence $\sum b_n$ converges absolutely.

To show that $\sum b_n = s$, let $t_n = b_1 + \cdots + b_n$, $s_n = a_1 + \cdots + a_n$. Given
$\varepsilon > 0$, choose $N$ so that $|s_N - s| < \varepsilon/2$ and so that $\sum_{k=1}^{\infty} |a_{N+k}| \le \varepsilon/2$. Then

$$|t_n - s| \le |t_n - s_N| + |s_N - s| < |t_n - s_N| + \frac{\varepsilon}{2}.$$

Choose $M$ so that $\{1, 2, \dots, N\} \subseteq \{f(1), f(2), \dots, f(M)\}$. Then $n > M$ implies
$f(n) > N$, and for such $n$ we have

$$|t_n - s_N| = |b_1 + \cdots + b_n - (a_1 + \cdots + a_N)|$$
$$= |a_{f(1)} + \cdots + a_{f(n)} - (a_1 + \cdots + a_N)|$$
$$\le |a_{N+1}| + |a_{N+2}| + \cdots \le \frac{\varepsilon}{2},$$

since all the terms $a_1, \dots, a_N$ cancel out in the subtraction. Hence, $n > M$
implies $|t_n - s| < \varepsilon$ and this means $\sum b_n = s$.
8.18 RIEMANN’S THEOREM ON CONDITIONALLY CONVERGENT SERIES

The hypothesis of absolute convergence is essential in Theorem 8.32. Riemann 
discovered that any conditionally convergent series of real terms can be rearranged 
to yield a series which converges to any prescribed sum. This remarkable fact is a 
consequence of the following theorem: 

Theorem 8.33. Let $\sum a_n$ be a conditionally convergent series with real-valued terms.
Let $x$ and $y$ be given numbers in the closed interval $[-\infty, +\infty]$, with $x \le y$. Then
there exists a rearrangement $\sum b_n$ of $\sum a_n$ such that

$$\liminf_{n\to\infty} t_n = x \quad \text{and} \quad \limsup_{n\to\infty} t_n = y,$$

where $t_n = b_1 + \cdots + b_n$.

Proof. Discarding those terms of a series which are zero does not affect its
convergence or divergence. Hence we might as well assume that no terms of $\sum a_n$ are
zero. Let $p_n$ denote the nth positive term of $\sum a_n$ and let $-q_n$ denote its nth negative
term. Then $\sum p_n$ and $\sum q_n$ are both divergent series of positive terms. [Why?]
Next, construct two sequences of real numbers, say $\{x_n\}$ and $\{y_n\}$, such that

$$\lim_{n\to\infty} x_n = x, \qquad \lim_{n\to\infty} y_n = y, \qquad \text{with } x_n < y_n,\ y_1 > 0.$$

The idea of the proof is now quite simple. We take just enough (say $k_1$) positive
terms so that

$$p_1 + \cdots + p_{k_1} > y_1,$$

followed by just enough (say $r_1$) negative terms so that

$$p_1 + \cdots + p_{k_1} - q_1 - \cdots - q_{r_1} < x_1.$$

Next, we take just enough further positive terms so that

$$p_1 + \cdots + p_{k_1} - q_1 - \cdots - q_{r_1} + p_{k_1+1} + \cdots + p_{k_2} > y_2,$$

followed by just enough further negative terms to satisfy the inequality

$$p_1 + \cdots + p_{k_1} - q_1 - \cdots - q_{r_1} + p_{k_1+1} + \cdots + p_{k_2} - q_{r_1+1} - \cdots - q_{r_2} < x_2.$$

These steps are possible since $\sum p_n$ and $\sum q_n$ are both divergent series of positive
terms. If the process is continued in this way, we obviously obtain a rearrangement
of $\sum a_n$. We leave it to the reader to show that the partial sums of this rearrangement
have limit superior $y$ and limit inferior $x$.
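
The construction in this proof is effectively an algorithm. The Python sketch below is a simplified variant offered only as an illustration: it rearranges the alternating harmonic series $1 - \tfrac12 + \tfrac13 - \cdots$ and uses a single threshold `target` in place of the two sequences $\{x_n\}$, $\{y_n\}$ (the usual shortcut when one prescribed sum is wanted). The function name, the target value 1.5, and the number of terms are all assumptions chosen for the demo.

```python
def rearranged_partial_sums(target, num_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`."""
    next_odd, next_even = 1, 2      # denominators of unused positive / negative terms
    total, sums, used = 0.0, [], []
    for _ in range(num_terms):
        if total <= target:
            total += 1.0 / next_odd      # take the next unused positive term
            used.append(next_odd)
            next_odd += 2
        else:
            total -= 1.0 / next_even     # take the next unused negative term
            used.append(next_even)
            next_even += 2
        sums.append(total)
    return sums, used

sums, used = rearranged_partial_sums(target=1.5, num_terms=10000)
print(sums[-1])   # close to the prescribed value 1.5
```

Every term of the original series is used at most once and in its original relative order within the positive and the negative terms, so the output is an initial segment of a genuine rearrangement. After the first crossing of the target, the partial sums stay within the size of the most recently used term, which tends to 0.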


8.19 SUBSERIES 

Definition 8.34. Let $f$ be a function whose domain is $\mathbf{Z}^+$ and whose range is an
infinite subset of $\mathbf{Z}^+$, and assume that $f$ is one-to-one on $\mathbf{Z}^+$. Let $\sum a_n$ and $\sum b_n$ be
two series such that

$$b_n = a_{f(n)}, \quad \text{if } n \in \mathbf{Z}^+.$$

Then $\sum b_n$ is said to be a subseries of $\sum a_n$.

Theorem 8.35. If $\sum a_n$ converges absolutely, every subseries $\sum b_n$ also converges
absolutely. Moreover, we have

$$\left| \sum_{n=1}^{\infty} b_n \right| \le \sum_{n=1}^{\infty} |b_n| \le \sum_{n=1}^{\infty} |a_n|.$$

Proof. Given $n$, let $N$ be the largest integer in the set $\{f(1), \dots, f(n)\}$. Then

$$\left| \sum_{k=1}^{n} b_k \right| \le \sum_{k=1}^{n} |b_k| \le \sum_{k=1}^{N} |a_k| \le \sum_{k=1}^{\infty} |a_k|.$$

The inequality $\sum_{k=1}^{n} |b_k| \le \sum_{k=1}^{\infty} |a_k|$ implies absolute convergence of $\sum b_n$.

Theorem 8.36. Let $\{f_1, f_2, \dots\}$ be a countable collection of functions, each defined
on $\mathbf{Z}^+$, having the following properties:

a) Each $f_n$ is one-to-one on $\mathbf{Z}^+$.

b) The range $f_n(\mathbf{Z}^+)$ is a subset $Q_n$ of $\mathbf{Z}^+$.

c) $\{Q_1, Q_2, \dots\}$ is a collection of disjoint sets whose union is $\mathbf{Z}^+$.

Let $\sum a_n$ be an absolutely convergent series and define

$$b_k(n) = a_{f_k(n)}, \quad \text{if } n \in \mathbf{Z}^+,\ k \in \mathbf{Z}^+.$$

Then:

i) For each $k$, $\sum_{n=1}^{\infty} b_k(n)$ is an absolutely convergent subseries of $\sum a_n$.

ii) If $s_k = \sum_{n=1}^{\infty} b_k(n)$, the series $\sum_{k=1}^{\infty} s_k$ converges absolutely and has the same
sum as $\sum_{k=1}^{\infty} a_k$.

Proof. Theorem 8.35 implies (i). To prove (ii), let $t_k = |s_1| + \cdots + |s_k|$. Then

$$t_k \le \sum_{n=1}^{\infty} |b_1(n)| + \cdots + \sum_{n=1}^{\infty} |b_k(n)| = \sum_{n=1}^{\infty} \bigl(|b_1(n)| + \cdots + |b_k(n)|\bigr)
= \sum_{n=1}^{\infty} \bigl(|a_{f_1(n)}| + \cdots + |a_{f_k(n)}|\bigr).$$

But $\sum_{n=1}^{\infty} \bigl(|a_{f_1(n)}| + \cdots + |a_{f_k(n)}|\bigr) \le \sum_{n=1}^{\infty} |a_n|$. This proves that $\sum s_k$ has
bounded partial sums and hence $\sum s_k$ converges absolutely.

To find the sum of $\sum s_k$, we proceed as follows: Let $\varepsilon > 0$ be given and choose
$N$ so that $n \ge N$ implies

$$\sum_{k=1}^{\infty} |a_k| - \sum_{k=1}^{n} |a_k| < \frac{\varepsilon}{2}. \tag{18}$$

Choose enough functions $f_1, \dots, f_r$ so that each term $a_1, a_2, \dots, a_N$ will appear
somewhere in the sum

$$\sum_{n=1}^{\infty} a_{f_1(n)} + \cdots + \sum_{n=1}^{\infty} a_{f_r(n)}.$$

The number $r$ depends on $N$ and hence on $\varepsilon$. If $n > r$ and $n > N$, we have

$$\left| s_1 + s_2 + \cdots + s_n - \sum_{k=1}^{n} a_k \right| \le |a_{N+1}| + |a_{N+2}| + \cdots < \frac{\varepsilon}{2}, \tag{19}$$

because the terms $a_1, a_2, \dots, a_N$ cancel in the subtraction. Now (18) implies

$$\left| \sum_{k=1}^{\infty} a_k - \sum_{k=1}^{n} a_k \right| < \frac{\varepsilon}{2}.$$

When this is combined with (19) we find

$$\left| s_1 + \cdots + s_n - \sum_{k=1}^{\infty} a_k \right| < \varepsilon,$$

if $n > r$, $n > N$. This completes the proof of (ii).


8.20 DOUBLE SEQUENCES 

Definition 8.37. A function $f$ whose domain is $\mathbf{Z}^+ \times \mathbf{Z}^+$ is called a double sequence.

NOTE. We shall be interested only in real- or complex-valued double sequences.

Definition 8.38. If $a \in \mathbf{C}$, we write $\lim_{p,q\to\infty} f(p, q) = a$ and we say that the
double sequence $f$ converges to $a$, provided that the following condition is satisfied:
For every $\varepsilon > 0$, there exists an $N$ such that $|f(p, q) - a| < \varepsilon$ whenever both
$p > N$ and $q > N$.

Theorem 8.39. Assume that $\lim_{p,q\to\infty} f(p, q) = a$. For each fixed $p$, assume that
the limit $\lim_{q\to\infty} f(p, q)$ exists. Then the limit $\lim_{p\to\infty} (\lim_{q\to\infty} f(p, q))$ also exists
and has the value $a$.

NOTE. To distinguish $\lim_{p,q\to\infty} f(p, q)$ from $\lim_{p\to\infty} (\lim_{q\to\infty} f(p, q))$, the first is
called a double limit, the second an iterated limit.

Proof. Let $F(p) = \lim_{q\to\infty} f(p, q)$. Given $\varepsilon > 0$, choose $N_1$ so that

$$|f(p, q) - a| < \frac{\varepsilon}{2}, \quad \text{if } p > N_1 \text{ and } q > N_1. \tag{20}$$

For each $p$ we can choose $N_2$ so that

$$|F(p) - f(p, q)| < \frac{\varepsilon}{2}, \quad \text{if } q > N_2. \tag{21}$$

(Note that $N_2$ depends on $p$ as well as on $\varepsilon$.) For each $p > N_1$ choose $N_2$, and then
choose a fixed $q$ greater than both $N_1$ and $N_2$. Then both (20) and (21) hold and
hence

$$|F(p) - a| < \varepsilon, \quad \text{if } p > N_1.$$

Therefore, $\lim_{p\to\infty} F(p) = a$.

NOTE. A similar result holds if we interchange the roles of $p$ and $q$.

Thus the existence of the double limit $\lim_{p,q\to\infty} f(p, q)$ and of $\lim_{q\to\infty} f(p, q)$
implies the existence of the iterated limit

$$\lim_{p\to\infty} \left( \lim_{q\to\infty} f(p, q) \right).$$

The following example shows that the converse is not true.

Example. Let

$$f(p, q) = \frac{pq}{p^2 + q^2} \qquad (p = 1, 2, \dots,\ q = 1, 2, \dots).$$

Then $\lim_{q\to\infty} f(p, q) = 0$ and hence $\lim_{p\to\infty} (\lim_{q\to\infty} f(p, q)) = 0$. But $f(p, q) = \tfrac12$
when $p = q$ and $f(p, q) = \tfrac25$ when $p = 2q$, and hence it is clear that the double limit
cannot exist in this case.
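
The values quoted in this example can be reproduced exactly. The Python sketch below is an illustration only (the sampling ranges and the variable names are assumptions for the demo); it evaluates $f(p, q) = pq/(p^2 + q^2)$ along $p = q$, along $p = 2q$, and far out along a fixed row.

```python
from fractions import Fraction

def f(p, q):
    return Fraction(p * q, p * p + q * q)

diagonal = [f(p, p) for p in range(1, 6)]           # along p = q
skew = [f(2 * q, q) for q in range(1, 6)]           # along p = 2q
row_tail = float(f(3, 10 ** 6))                     # fixed p = 3, large q
print(diagonal[0], skew[0], row_tail)
```

The function is constant along each ray $p = cq$, with a value that depends on $c$, so no single number can serve as the double limit even though every row tends to 0.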

A suitable converse of Theorem 8.39 can be established by introducing the 
notion of uniform convergence. (This is done in the next chapter in Theorem 9.16.) 

Further examples illustrating the behavior of double sequences are given in 
Exercise 8.28. 


8.21 DOUBLE SERIES 

Definition 8.40. Let $f$ be a double sequence and let $s$ be the double sequence defined
by the equation

$$s(p, q) = \sum_{m=1}^{p} \sum_{n=1}^{q} f(m, n).$$

The pair $(f, s)$ is called a double series and is denoted by the symbol $\sum_{m,n} f(m, n)$ or,
more briefly, by $\sum f(m, n)$. The double series is said to converge to the sum $a$ if

$$\lim_{p,q\to\infty} s(p, q) = a.$$

Each number $f(m, n)$ is called a term of the double series and each $s(p, q)$ is
a partial sum. If $\sum f(m, n)$ has only positive terms, it is easy to show that it
converges if, and only if, the set of partial sums is bounded. (See Exercise 8.29.) We
say $\sum f(m, n)$ converges absolutely if $\sum |f(m, n)|$ converges. Theorem 8.18 is valid
for double series. (See Exercise 8.29.)
8.22 REARRANGEMENT THEOREM FOR DOUBLE SERIES

Definition 8.41. Let f be a double sequence and let g be a one-to-one function 
defined on Z + with range Z + x Z + . Let G be the sequence defined by 

Gin) = /[>(«)] if n e Z + . 

Then g is said to be an arrangement of the double sequence f into the sequence G. 

Theorem 8.42. Let £/(#?, ri) be a given double series and let g be an arrangement 
of the double sequence f into a sequence G. Then 

a) converges absolutely if and only if X/(w, n) converges absolutely. 
Assuming that X/(aw, ri) does converge absolutely , with sum S , we have further: 

b) E°- i G(n) = 5. 

c ) ifQn, n) and Zm=i/( w » n ) both converge absolutely. 

d) If A m = S"=i fin J, «) and B„ = £“=i fim, ri), both series '£A m and 
converge absolutely and both have sum S. That is , 

OO OO 00 00 

E E fim, n) T, fi m > n ) = S ■ 

m=l n — 1 n = 1 m = 1 

Proof Let T k = |G(1)| + • • • + | G(k)\ and let 

Sip, q) = Z) I fi m ’ n)\. 

m— 1 n = 1 

Then, for each k, there exists a pair ip, q) such that T k < S(p, q) and, conversely, 
for each pair ( p , q) there exists an integer r such that S(p, q) < T r . These in- 
equalities tell us that £|(/(«)| has bounded partial sums if, and only if, £| f(m, ri ) | 
has bounded partial sums. This proves (a). 

Now assume that $\sum |f(m, n)|$ converges. Before we prove (b), we will show that
the sum of the series $\sum G(n)$ is independent of the function g used to construct G
from f. To see this, let h be another arrangement of the double sequence f into a
sequence H. Then we have

$$G(n) = f[g(n)] \quad \text{and} \quad H(n) = f[h(n)].$$

But this means that $G(n) = H[k(n)]$, where $k(n) = h^{-1}[g(n)]$. Since k is a
one-to-one mapping of $\mathbf{Z}^+$ onto $\mathbf{Z}^+$, the series $\sum H(n)$ is a rearrangement of $\sum G(n)$,
and hence has the same sum. Let us denote this common sum by S'. We will
show later that S' = S.

Now observe that each series in (c) is a subseries of $\sum G(n)$. Hence (c) follows
from (a). Applying Theorem 8.36, we conclude that $\sum A_m$ converges absolutely
and has sum S'. The same thing is true of $\sum B_n$. It remains to prove that S' = S.

For this purpose let $T = \lim_{p, q \to \infty} S(p, q)$. Given $\varepsilon > 0$, choose N so that
$0 \le T - S(p, q) < \varepsilon/2$ whenever p > N and q > N. Now write

$$t_k = \sum_{n=1}^{k} G(n), \qquad s(p, q) = \sum_{m=1}^{p} \sum_{n=1}^{q} f(m, n).$$

Choose M so that $t_M$ includes all terms f(m, n) with

$$1 \le m \le N + 1, \qquad 1 \le n \le N + 1.$$

Then $t_M - s(N + 1, N + 1)$ is a sum of terms f(m, n) with either m > N or
n > N. Therefore, if $n \ge M$, we have

$$|t_n - s(N + 1, N + 1)| \le T - S(N + 1, N + 1) < \frac{\varepsilon}{2}.$$

Similarly,

$$|S - s(N + 1, N + 1)| \le T - S(N + 1, N + 1) < \frac{\varepsilon}{2}.$$

Thus, given $\varepsilon > 0$, we can always find M so that $|t_n - S| < \varepsilon$ whenever $n \ge M$.
Since $\lim_{n \to \infty} t_n = S'$, it follows that S' = S.

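NOTE. The series $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} f(m, n)$ and $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} f(m, n)$ are called "iterated
series". Convergence of both iterated series does not imply their equality. For
example, suppose

$$f(m, n) = \begin{cases} 1, & \text{if } m = n + 1,\ n = 1, 2, \dots, \\ -1, & \text{if } m = n - 1,\ n = 1, 2, \dots, \\ 0, & \text{otherwise.} \end{cases}$$

Then

$$\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} f(m, n) = -1, \quad \text{but} \quad \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} f(m, n) = 1.$$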
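The two iterated sums in this example can be checked by direct computation. The following is a minimal sketch, not part of the original text: the helper names `row_sum` and `col_sum` are illustrative, and the code relies on the observation that each row and each column of f has at most two nonzero entries, so every inner sum is actually finite.

```python
def f(m, n):
    """The double sequence of the example: +1 below the diagonal, -1 above it."""
    if m == n + 1:
        return 1
    if m == n - 1:
        return -1
    return 0

def row_sum(m):
    # Sum over n for fixed m; nonzero terms occur only at n = m - 1 and n = m + 1,
    # so summing n = 1, ..., m + 1 captures the whole inner series.
    return sum(f(m, n) for n in range(1, m + 2))

def col_sum(n):
    # Sum over m for fixed n; nonzero terms occur only at m = n - 1 and m = n + 1.
    return sum(f(m, n) for m in range(1, n + 2))

# Outer sums: every row after the first and every column after the first sums to 0,
# so truncating the outer sum at 50 already gives the full value.
rows_then_outer = sum(row_sum(m) for m in range(1, 51))
cols_then_outer = sum(col_sum(n) for n in range(1, 51))
print(rows_then_outer, cols_then_outer)
```

Only the first row (sum -1) and the first column (sum +1) contribute, which is why the two iterated series disagree.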


8.23 A SUFFICIENT CONDITION FOR EQUALITY OF ITERATED SERIES 

Theorem 8.43. Let f be a complex-valued double sequence. Assume that $\sum_{n=1}^{\infty} f(m, n)$
converges absolutely for each fixed m and that

$$\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |f(m, n)|$$

converges. Then:

a) The double series $\sum_{m,n} f(m, n)$ converges absolutely.

b) The series $\sum_{m=1}^{\infty} f(m, n)$ converges absolutely for each n.

c) Both iterated series $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} f(m, n)$ and $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} f(m, n)$ converge
absolutely and we have

$$\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} f(m, n) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} f(m, n) = \sum_{m,n} f(m, n).$$

Proof. Let g be an arrangement of the double sequence f into a sequence G. Then
$\sum G(n)$ is absolutely convergent since all the partial sums of $\sum |G(n)|$ are bounded
by $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |f(m, n)|$. By Theorem 8.42(a), the double series $\sum_{m,n} f(m, n)$
converges absolutely, and statements (b) and (c) also follow from Theorem 8.42.

As an application of Theorem 8.43 we prove the following theorem concerning
double series $\sum_{m,n} f(m, n)$ whose terms can be factored into a function of m times
a function of n.

Theorem 8.44. Let $\sum a_m$ and $\sum b_n$ be two absolutely convergent series with sums
A and B, respectively. Let f be the double sequence defined by the equation

$$f(m, n) = a_m b_n, \quad \text{if } (m, n) \in \mathbf{Z}^+ \times \mathbf{Z}^+.$$

Then $\sum_{m,n} f(m, n)$ converges absolutely and has the sum AB.

Proof. We have

$$\sum_{m=1}^{\infty} |a_m| \sum_{n=1}^{\infty} |b_n| = \sum_{m=1}^{\infty} \left( |a_m| \sum_{n=1}^{\infty} |b_n| \right) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_m| \, |b_n|.$$

Therefore, by Theorem 8.43, the double series $\sum_{m,n} a_m b_n$ converges absolutely and
has sum AB.


8.24 MULTIPLICATION OF SERIES 

Given two series $\sum a_n$ and $\sum b_n$, we can always form the double series $\sum f(m, n)$,
where $f(m, n) = a_m b_n$. For every arrangement g of f into a sequence G, we are led
to a further series $\sum G(n)$. By analogy with finite sums, it seems natural to refer to
$\sum f(m, n)$ or to $\sum G(n)$ as the "product" of $\sum a_n$ and $\sum b_n$, and Theorem 8.44 justifies
this terminology when the two given series $\sum a_n$ and $\sum b_n$ are absolutely convergent.
However, if either $\sum a_n$ or $\sum b_n$ is conditionally convergent, we have no guarantee
that either $\sum f(m, n)$ or $\sum G(n)$ will converge. Moreover, if one of them does
converge, its sum need not be AB. The convergence and the sum will depend on
the arrangement g. Different choices of g may yield different values of the product.
There is one very important case in which the terms f(m, n) are arranged
"diagonally" to produce $\sum G(n)$, and then parentheses are inserted by grouping together
those terms $a_m b_n$ for which m + n has a fixed value. This product is called the
Cauchy product and is defined as follows:

Definition 8.45. Given two series $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$, define

$$c_n = \sum_{k=0}^{n} a_k b_{n-k}, \quad \text{if } n = 0, 1, 2, \dots \tag{22}$$

The series $\sum_{n=0}^{\infty} c_n$ is called the Cauchy product of $\sum a_n$ and $\sum b_n$.

NOTE. The Cauchy product arises in a natural way when we multiply two power
series. (See Exercise 8.33.)
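Formula (22) translates directly into a short computation. The sketch below is illustrative only: the function name `cauchy_product` and the choice of the geometric series with ratio 1/2 are assumptions made for the example. Exact rational arithmetic is used so that the result can be compared with the closed form $c_n = (n + 1)x^n$, which is what (22) gives when $a_n = b_n = x^n$.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Return c_0, ..., c_{N-1} with c_n = sum_{k=0}^{n} a[k] * b[n - k], as in (22).

    a and b are lists holding the first terms a_0, a_1, ... and b_0, b_1, ...;
    N is the length of the shorter list.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

x = Fraction(1, 2)
geometric = [x ** n for n in range(8)]          # a_n = b_n = x^n
c = cauchy_product(geometric, geometric)
closed_form = [(n + 1) * x ** n for n in range(8)]
print(c == closed_form)
```

Each $c_n$ collects exactly the products $a_k b_{n-k}$ lying on one diagonal k + (n - k) = n of the double array $a_m b_n$.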

Because of Theorems 8.44 and 8.13, absolute convergence of both $\sum a_n$ and
$\sum b_n$ implies convergence of the Cauchy product to the value

$$\sum_{n=0}^{\infty} c_n = \left( \sum_{n=0}^{\infty} a_n \right) \left( \sum_{n=0}^{\infty} b_n \right). \tag{23}$$

This equation may fail to hold if both $\sum a_n$ and $\sum b_n$ are conditionally convergent.
(See Exercise 8.32.) However, we can prove that (23) is valid if at least one of
$\sum a_n$, $\sum b_n$ is absolutely convergent.


Theorem 8.46 (Mertens). Assume that $\sum_{n=0}^{\infty} a_n$ converges absolutely and has sum
A, and suppose $\sum_{n=0}^{\infty} b_n$ converges with sum B. Then the Cauchy product of these
two series converges and has sum AB.


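Proof. Define $A_n = \sum_{k=0}^{n} a_k$, $B_n = \sum_{k=0}^{n} b_k$, $C_n = \sum_{k=0}^{n} c_k$, where $c_k$ is given by
(22). Let $d_n = B - B_n$ and $e_n = \sum_{k=0}^{n} a_k d_{n-k}$. Then

$$C_p = \sum_{n=0}^{p} \sum_{k=0}^{n} a_k b_{n-k} = \sum_{n=0}^{p} \sum_{k=0}^{p} f_n(k), \tag{24}$$

where

$$f_n(k) = \begin{cases} a_k b_{n-k}, & \text{if } n \ge k, \\ 0, & \text{if } n < k. \end{cases}$$

Then (24) becomes

$$C_p = \sum_{k=0}^{p} \sum_{n=0}^{p} f_n(k) = \sum_{k=0}^{p} \sum_{n=k}^{p} a_k b_{n-k} = \sum_{k=0}^{p} a_k \sum_{m=0}^{p-k} b_m = \sum_{k=0}^{p} a_k B_{p-k}$$

$$= \sum_{k=0}^{p} a_k (B - d_{p-k}) = A_p B - e_p.$$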

To complete the proof, it suffices to show that $e_p \to 0$ as $p \to \infty$. The sequence
$\{d_n\}$ converges to 0, since $B = \sum_{n=0}^{\infty} b_n$. Choose M > 0 so that $|d_n| \le M$ for all n,
and let $K = \sum_{n=0}^{\infty} |a_n|$. Given $\varepsilon > 0$, choose N so that n > N implies $|d_n| < \varepsilon/(2K)$
and also so that

$$\sum_{n=N+1}^{\infty} |a_n| < \frac{\varepsilon}{2M}.$$

Then, for p > 2N, we can write

$$|e_p| \le \sum_{k=0}^{N} |a_k d_{p-k}| + \sum_{k=N+1}^{p} |a_k d_{p-k}| \le \frac{\varepsilon}{2K} \sum_{k=0}^{N} |a_k| + M \sum_{k=N+1}^{p} |a_k|$$

$$\le \frac{\varepsilon}{2K} \sum_{k=0}^{\infty} |a_k| + M \sum_{k=N+1}^{\infty} |a_k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This proves that $e_p \to 0$ as $p \to \infty$, and hence $C_p \to AB$ as $p \to \infty$.
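Mertens' theorem can be illustrated numerically with one absolutely convergent and one conditionally convergent factor. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $a_n = 2^{-n}$ (sum A = 2) and $b_n = (-1)^n/(n+1)$ (sum B = log 2, the alternating harmonic series), and the names `cauchy_partial_sum`, `a`, `b` are hypothetical.

```python
import math

def cauchy_partial_sum(a, b, p):
    """C_p = sum_{n=0}^{p} c_n, where c_n = sum_{k=0}^{n} a(k) * b(n - k)."""
    return sum(a(k) * b(n - k) for n in range(p + 1) for k in range(n + 1))

def a(n):
    # Absolutely convergent factor: sum of 2^{-n} over n >= 0 equals 2.
    return 0.5 ** n

def b(n):
    # Conditionally convergent factor: sum of (-1)^n / (n + 1) equals log 2.
    return (-1) ** n / (n + 1)

p = 400
C_p = cauchy_partial_sum(a, b, p)
target = 2.0 * math.log(2.0)    # A * B
print(C_p, target, abs(C_p - target))
```

In the notation of the proof, the difference $C_p - A_p B$ is $-e_p$; here $|d_n| \le 1/(n+2)$, so the error is of order 1/p and the partial sums approach AB slowly but steadily.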

A related theorem (due to Abel), in which no absolute convergence is assumed, 
will be proved in the next chapter. (See Theorem 9.32.) 

Another product, known as the Dirichlet product, is of particular importance
in the Theory of Numbers. We take $a_0 = b_0 = 0$ and, instead of defining $c_n$ by
(22), we use the formula

$$c_n = \sum_{d \mid n} a_d b_{n/d} \quad (n = 1, 2, \dots), \tag{25}$$

where $\sum_{d \mid n}$ means a sum extended over all positive divisors of n (including 1 and
n). For example, $c_6 = a_1 b_6 + a_2 b_3 + a_3 b_2 + a_6 b_1$, and $c_7 = a_1 b_7 + a_7 b_1$.
The analog of Mertens' theorem holds also for this product. The Dirichlet product
arises in a natural way when we multiply Dirichlet series. (See Exercise 8.34.)
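Formula (25) can be sketched in a few lines. The function name `dirichlet_product` and the test sequence are assumptions made for illustration: taking $a_n = b_n = 1$ for every n makes $c_n$ equal to the number of positive divisors of n, which gives an easy check against the two worked cases $c_6$ and $c_7$.

```python
def dirichlet_product(a, b, n):
    """c_n = sum over positive divisors d of n of a(d) * b(n / d), as in (25)."""
    return sum(a(d) * b(n // d) for d in range(1, n + 1) if n % d == 0)

def one(n):
    # Constant sequence a_n = 1 (n >= 1); its Dirichlet square counts divisors.
    return 1

c6 = dirichlet_product(one, one, 6)   # divisors 1, 2, 3, 6
c7 = dirichlet_product(one, one, 7)   # divisors 1, 7
print(c6, c7)
```

The four terms contributing to `c6` correspond to the pairs (d, n/d) = (1, 6), (2, 3), (3, 2), (6, 1) listed in the text.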


8.25 CESARO SUMMABILITY 

Definition 8.47. Let $s_n$ denote the nth partial sum of the series $\sum a_n$, and let $\{\sigma_n\}$ be
the sequence of arithmetic means defined by

$$\sigma_n = \frac{s_1 + \cdots + s_n}{n}, \quad \text{if } n = 1, 2, \dots \tag{26}$$

The series $\sum a_n$ is said to be Cesàro summable (or (C, 1) summable) if $\{\sigma_n\}$ converges.
If $\lim_{n \to \infty} \sigma_n = S$, then S is called the Cesàro sum (or (C, 1) sum) of $\sum a_n$, and we
write

$$\sum a_n = S \quad (C, 1).$$


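Example 1. Let $a_n = z^{n-1}$, $|z| = 1$, $z \ne 1$. Then

$$s_n = \frac{1}{1 - z} - \frac{z^n}{1 - z} \quad \text{and} \quad \sigma_n = \frac{1}{1 - z} - \frac{1}{n} \, \frac{z(1 - z^n)}{(1 - z)^2}.$$

Therefore,

$$\sum_{n=1}^{\infty} z^{n-1} = \frac{1}{1 - z} \quad (C, 1).$$

In particular,

$$\sum_{n=1}^{\infty} (-1)^{n-1} = \frac{1}{2} \quad (C, 1).$$

Example 2. Let $a_n = (-1)^{n+1} n$. In this case, $s_{2n} = -n$ and $s_{2n-1} = n$, so that
$\sigma_{2n} = 0$ and $\sigma_{2n-1} = n/(2n - 1)$. Therefore

$$\limsup_{n \to \infty} \sigma_n = \frac{1}{2}, \qquad \liminf_{n \to \infty} \sigma_n = 0,$$

and hence $\sum (-1)^{n+1} n$ is not (C, 1) summable.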
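The arithmetic means in (26) are easy to compute exactly for both examples. The following sketch is illustrative; the helper name `cesaro_means` is an assumption, and `fractions.Fraction` is used so that the means are exact rationals rather than rounded floats.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_means(terms):
    """Return [sigma_1, ..., sigma_N], where sigma_n = (s_1 + ... + s_n) / n as in (26)."""
    partial_sums = list(accumulate(terms))          # s_1, s_2, ...
    running_totals = accumulate(partial_sums)       # s_1 + ... + s_n
    return [Fraction(total, n) for n, total in enumerate(running_totals, start=1)]

# Example 1 with z = -1: the series 1 - 1 + 1 - 1 + ...
grandi = [(-1) ** (n - 1) for n in range(1, 1001)]
sigma = cesaro_means(grandi)

# Example 2: the series 1 - 2 + 3 - 4 + ...
alternating = [(-1) ** (n + 1) * n for n in range(1, 1001)]
tau = cesaro_means(alternating)

print(sigma[-1], tau[-1], tau[-2])
```

For the first series the means settle at 1/2, while for the second the even-indexed means are 0 and the odd-indexed means stay near 1/2, so the sequence of means has no limit.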

Theorem 8.48. If a series is convergent with sum S, then it is also (C, 1) summable 
with Cesaro sum S. 

Proof. Let $s_n$ denote the nth partial sum of the series, define $\sigma_n$ by (26), and
introduce $t_n = s_n - S$, $\tau_n = \sigma_n - S$. Then we have

$$\tau_n = \frac{t_1 + \cdots + t_n}{n}, \tag{27}$$

and we must prove that $\tau_n \to 0$ as $n \to \infty$. Choose A > 0 so that each $|t_n| \le A$.
Given $\varepsilon > 0$, choose N so that n > N implies $|t_n| < \varepsilon$. Taking n > N in (27),
we obtain

$$|\tau_n| \le \frac{|t_1| + \cdots + |t_N|}{n} + \frac{|t_{N+1}| + \cdots + |t_n|}{n} < \frac{NA}{n} + \varepsilon.$$

Hence, $\limsup_{n \to \infty} |\tau_n| \le \varepsilon$. Since $\varepsilon$ is arbitrary, it follows that $\lim_{n \to \infty} |\tau_n| = 0$.

NOTE. We have really proved that if a sequence $\{s_n\}$ converges, then the sequence
$\{\sigma_n\}$ of arithmetic means also converges and, in fact, to the same limit.

Cesaro summability is just one of a large class of “summability methods” 
which can be used to assign a “sum” to an infinite series. Theorem 8.48 and 
Example 1 (following Definition 8.47) show us that Cesaro’s method has a wider 
scope than ordinary convergence. The theory of summability methods is an 
important and fascinating subject, but one which we cannot enter into here. For 
an excellent account of the subject the reader is referred to Hardy’s Divergent 
Series (Reference 8.1). We shall see later that (C, 1) summability plays an impor- 
tant role in the theory of Fourier series. (See Theorem 11.15.) 


8.26 INFINITE PRODUCTS 

This section gives a brief introduction to the theory of infinite products. 

Definition 8.49. Given a sequence $\{u_n\}$ of real or complex numbers, let

$$p_1 = u_1, \quad p_2 = u_1 u_2, \quad p_n = u_1 u_2 \cdots u_n = \prod_{k=1}^{n} u_k. \tag{28}$$

The ordered pair of sequences $(\{u_n\}, \{p_n\})$ is called an infinite product (or simply,
a product). The number $p_n$ is called the nth partial product and $u_n$ is called the nth
factor of the product. The following symbols are used to denote the product defined
by (28):

$$u_1 u_2 \cdots u_n \cdots, \qquad \prod_{n=1}^{\infty} u_n. \tag{29}$$

NOTE. The symbol $\prod_{n=N+1}^{\infty} u_n$ means $\prod_{n=1}^{\infty} u_{N+n}$. We also write $\prod u_n$ when there
is no danger of misunderstanding.

By analogy with infinite series, it would seem natural to call the product (29) 
convergent if {/?„} converges. However, this definition would be inconvenient 
since every product having one factor equal to zero would converge, regardless of 
the behavior of the remaining factors. The following definition turns out to be 
more useful : 

Definition 8.50. Given an infinite product $\prod_{n=1}^{\infty} u_n$, let $p_n = \prod_{k=1}^{n} u_k$.

a) If infinitely many factors $u_n$ are zero, we say the product diverges to zero.

b) If no factor $u_n$ is zero, we say the product converges if there exists a number
$p \ne 0$ such that $\{p_n\}$ converges to p. In this case, p is called the value of the
product and we write $p = \prod_{n=1}^{\infty} u_n$. If $\{p_n\}$ converges to zero, we say the product
diverges to zero.

c) If there exists an N such that n > N implies $u_n \ne 0$, we say $\prod_{n=1}^{\infty} u_n$ converges,
provided that $\prod_{n=N+1}^{\infty} u_n$ converges as described in (b). In this case, the value
of the product $\prod_{n=1}^{\infty} u_n$ is

$$u_1 u_2 \cdots u_N \prod_{n=N+1}^{\infty} u_n.$$

d) $\prod_{n=1}^{\infty} u_n$ is called divergent if it does not converge as described in (b) or (c).

Note that the value of a convergent infinite product can be zero. But this happens 
if, and only if, a finite number of factors are zero. The convergence of an infinite 
product is not affected by inserting or removing a finite number of factors, zero or 
not. It is this fact which makes Definition 8.50 very convenient. 

Example. $\prod_{n=1}^{\infty} (1 + 1/n)$ and $\prod_{n=2}^{\infty} (1 - 1/n)$ are both divergent. In the first case,
$p_n = n + 1$, and in the second case, $p_n = 1/n$.
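The two closed forms for the partial products can be confirmed with exact rational arithmetic. This is a small illustrative sketch; the helper name `partial_products` is an assumption and not part of the text.

```python
from fractions import Fraction

def partial_products(factors):
    """Return the list of partial products p_1, p_2, ... of the given factors."""
    products, p = [], Fraction(1)
    for u in factors:
        p *= u
        products.append(p)
    return products

# Factors 1 + 1/n for n = 1, ..., 10: the partial products telescope to n + 1.
first = partial_products(1 + Fraction(1, n) for n in range(1, 11))

# Factors 1 - 1/n for n = 2, ..., 10: the partial products telescope to 1/n.
second = partial_products(1 - Fraction(1, n) for n in range(2, 11))

print(first[-1], second[-1])
```

The first sequence of partial products is unbounded and the second tends to 0, so under Definition 8.50 neither product converges; the second one diverges to zero.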

Theorem 8.51 (Cauchy condition for products). The infinite product $\prod u_n$
converges if, and only if, for every $\varepsilon > 0$ there exists an N such that n > N implies

$$|u_{n+1} u_{n+2} \cdots u_{n+k} - 1| < \varepsilon, \quad \text{for } k = 1, 2, 3, \dots \tag{30}$$

Proof. Assume that the product $\prod u_n$ converges. We can assume that no $u_n$ is
zero (discarding a few terms if necessary). Let $p_n = u_1 \cdots u_n$ and $p = \lim_{n \to \infty} p_n$.
Then $p \ne 0$ and hence there exists an M > 0 such that $|p_n| > M$. Now $\{p_n\}$
satisfies the Cauchy condition for sequences. Hence, given $\varepsilon > 0$, there is an N
such that n > N implies $|p_{n+k} - p_n| < \varepsilon M$ for k = 1, 2, ... Dividing by $|p_n|$,
we obtain (30).

Now assume that condition (30) holds. Then n > N implies $u_n \ne 0$. [Why?]
Take $\varepsilon = \frac{1}{2}$ in (30), let $N_0$ be the corresponding N, and let $q_n = u_{N_0+1} u_{N_0+2} \cdots u_n$
if $n > N_0$. Then (30) implies $\frac{1}{2} < |q_n| < \frac{3}{2}$. Therefore, if $\{q_n\}$ converges, it cannot
converge to zero. To show that $\{q_n\}$ does converge, let $\varepsilon > 0$ be arbitrary and
write (30) as follows:

$$\left| \frac{q_{n+k}}{q_n} - 1 \right| < \varepsilon.$$

This gives us $|q_{n+k} - q_n| < \varepsilon |q_n| < \frac{3}{2} \varepsilon$. Therefore, $\{q_n\}$ satisfies the Cauchy
condition for sequences and hence is convergent. This means that the product
$\prod u_n$ converges.

NOTE. Taking k = 1 in (30), we find that convergence of $\prod u_n$ implies
$\lim_{n \to \infty} u_n = 1$. For this reason, the factors of a product are written as $u_n = 1 + a_n$.
Thus convergence of $\prod (1 + a_n)$ implies $\lim_{n \to \infty} a_n = 0$.


Theorem 8.52. Assume that each $a_n > 0$. Then the product $\prod (1 + a_n)$ converges
if, and only if, the series $\sum a_n$ converges.

Proof. Part of the proof is based on the following inequality:

$$1 + x \le e^x. \tag{31}$$

Although (31) holds for all real x, we need it only for $x \ge 0$. When x > 0, (31)
is a simple consequence of the Mean-Value Theorem, which gives us

$$e^x - 1 = x e^{x_0}, \quad \text{where } 0 < x_0 < x.$$

Since $e^{x_0} > 1$, (31) follows at once from this equation.

Now let $s_n = a_1 + a_2 + \cdots + a_n$, $p_n = (1 + a_1)(1 + a_2) \cdots (1 + a_n)$. Both
sequences $\{s_n\}$ and $\{p_n\}$ are increasing, and hence to prove the theorem we need
only show that $\{s_n\}$ is bounded if, and only if, $\{p_n\}$ is bounded.

First, the inequality $p_n > s_n$ is obvious. Next, taking $x = a_k$ in (31), where
k = 1, 2, ..., n, and multiplying, we find $p_n < e^{s_n}$. Hence, $\{s_n\}$ is bounded if,
and only if, $\{p_n\}$ is bounded. Note that $\{p_n\}$ cannot converge to zero since each
$p_n \ge 1$. Note also that

$$p_n \to +\infty \quad \text{if } s_n \to +\infty.$$
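The two-sided bound $s_n < p_n < e^{s_n}$ used in this proof can be observed numerically. The sketch below is an illustration under an assumed choice of terms, $a_n = 1/n^2$, which is not taken from the text; the helper name `sums_and_products` is likewise hypothetical. Floating-point arithmetic is used, so the check is approximate rather than exact.

```python
import math

def sums_and_products(terms):
    """Return the pairs (s_n, p_n): s_n = a_1 + ... + a_n, p_n = (1 + a_1) ... (1 + a_n)."""
    s, p, pairs = 0.0, 1.0, []
    for a in terms:
        s += a
        p *= 1.0 + a
        pairs.append((s, p))
    return pairs

pairs = sums_and_products(1.0 / n ** 2 for n in range(1, 2001))
s_last, p_last = pairs[-1]
print(s_last, p_last, math.exp(s_last))
```

Because $\sum 1/n^2$ converges, both sequences stay bounded, in line with the theorem; replacing the terms by 1/n would make both grow without bound.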

Definition 8.53. The product $\prod (1 + a_n)$ is said to converge absolutely if $\prod (1 + |a_n|)$
converges.


Theorem 8.54. Absolute convergence of $\prod (1 + a_n)$ implies convergence.

Proof. Use the Cauchy condition along with the inequality

$$|(1 + a_{n+1})(1 + a_{n+2}) \cdots (1 + a_{n+k}) - 1| \le (1 + |a_{n+1}|)(1 + |a_{n+2}|) \cdots (1 + |a_{n+k}|) - 1.$$

NOTE. Theorem 8.52 tells us that $\prod (1 + a_n)$ converges absolutely if, and only if,
$\sum a_n$ converges absolutely. In Exercise 8.43 we give an example in which $\prod (1 + a_n)$
converges but $\sum a_n$ diverges.

A result analogous to Theorem 8.52 is the following: 

Theorem 8.55. Assume that each $a_n \ge 0$. Then the product $\prod (1 - a_n)$ converges
if, and only if, the series $\sum a_n$ converges.

Proof. Convergence of $\sum a_n$ implies absolute convergence (and hence convergence)
of $\prod (1 - a_n)$.

To prove the converse, assume that $\sum a_n$ diverges. If $\{a_n\}$ does not converge to
zero, then $\prod (1 - a_n)$ also diverges. Therefore we can assume that $a_n \to 0$ as
$n \to \infty$. Discarding a few terms if necessary, we can assume that each $a_n \le \frac{1}{2}$.
Then each factor $1 - a_n \ge \frac{1}{2}$ (and hence $\ne 0$). Let

$$p_n = (1 - a_1)(1 - a_2) \cdots (1 - a_n), \qquad q_n = (1 + a_1)(1 + a_2) \cdots (1 + a_n).$$

Since we have

$$(1 - a_k)(1 + a_k) = 1 - a_k^2 \le 1,$$

we can write $p_n \le 1/q_n$. But in the proof of Theorem 8.52 we observed that
$q_n \to +\infty$ if $\sum a_n$ diverges. Therefore, $p_n \to 0$ as $n \to \infty$ and, by part (b) of
Definition 8.50, it follows that $\prod (1 - a_n)$ diverges to 0.


8.27 EULER’S PRODUCT FOR THE RIEMANN ZETA FUNCTION 

We conclude this chapter with a theorem of Euler which expresses the Riemann
zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ as an infinite product extended over all primes.

Theorem 8.56. Let $p_k$ denote the kth prime number. Then if s > 1 we have

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-s}}.$$

The product converges absolutely.

Proof. We consider the partial product $P_m = \prod_{k=1}^{m} (1 - p_k^{-s})^{-1}$ and show that
$P_m \to \zeta(s)$ as $m \to \infty$. Writing each factor as a geometric series we have

$$P_m = \prod_{k=1}^{m} \left( 1 + \frac{1}{p_k^{s}} + \frac{1}{p_k^{2s}} + \cdots \right),$$

a product of a finite number of absolutely convergent series. When we multiply
these series together and rearrange the terms according to increasing denominators,
we get another absolutely convergent series, a typical term of which is

$$\frac{1}{p_1^{a_1 s} p_2^{a_2 s} \cdots p_m^{a_m s}} = \frac{1}{n^s}, \quad \text{where } n = p_1^{a_1} \cdots p_m^{a_m},$$

and each $a_i \ge 0$. Therefore we have

$$P_m = \sum\nolimits_1 \frac{1}{n^s},$$

where $\sum_1$ is summed over those n having all their prime factors $\le p_m$. By the
unique factorization theorem (Theorem 1.9), each such n occurs once and only
once in $\sum_1$. Subtracting $P_m$ from $\zeta(s)$ we get

$$\zeta(s) - P_m = \sum_{n=1}^{\infty} \frac{1}{n^s} - \sum\nolimits_1 \frac{1}{n^s} = \sum\nolimits_2 \frac{1}{n^s},$$

where $\sum_2$ is summed over those n having at least one prime factor $> p_m$. Since these
n occur among the integers $> p_m$, we have

$$|\zeta(s) - P_m| \le \sum_{n > p_m} \frac{1}{n^s}.$$

As $m \to \infty$ the last sum tends to 0 because $\sum n^{-s}$ converges, so $P_m \to \zeta(s)$.

To prove that the product converges absolutely we use Theorem 8.52. The
product has the form $\prod (1 + a_k)$, where

$$a_k = \frac{1}{p_k^{s}} + \frac{1}{p_k^{2s}} + \cdots.$$

The series $\sum a_k$ converges absolutely since it is dominated by $\sum n^{-s}$. Therefore
$\prod (1 + a_k)$ also converges absolutely.
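The estimate $|\zeta(s) - P_m| \le \sum_{n > p_m} n^{-s}$ from the proof can be checked numerically for s = 2. The sketch below is illustrative: the helper names are assumptions, the value $\zeta(2) = \pi^2/6$ is the one quoted in Exercise 8.46, and the tail is bounded using the elementary estimate $\sum_{n > N} 1/n^2 < 1/N$, which is not stated in the text.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p with 2 <= p <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def euler_partial_product(s, primes):
    """P_m = product over the given primes of 1 / (1 - p^{-s})."""
    product = 1.0
    for p in primes:
        product *= 1.0 / (1.0 - p ** (-s))
    return product

primes = primes_up_to(1000)           # the largest prime used is 997
P_m = euler_partial_product(2, primes)
zeta_2 = math.pi ** 2 / 6
gap = zeta_2 - P_m                    # positive: the omitted terms 1/n^2 are all positive
print(P_m, zeta_2, gap)
```

The gap is positive and smaller than 1/997, in agreement with the tail bound used at the end of the proof.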


EXERCISES 


Sequences 

8.1 a) Given a real-valued sequence $\{a_n\}$ bounded above, let $u_n = \sup \{a_k : k \ge n\}$.
Then $\{u_n\}$ is decreasing and hence $U = \lim_{n \to \infty} u_n$ is either finite or $-\infty$. Prove that

$$U = \limsup_{n \to \infty} a_n = \lim_{n \to \infty} (\sup \{a_k : k \ge n\}).$$

b) Similarly, if $\{a_n\}$ is bounded below, prove that

$$V = \liminf_{n \to \infty} a_n = \lim_{n \to \infty} (\inf \{a_k : k \ge n\}).$$

If U and V are finite, show that:

c) There exists a subsequence of $\{a_n\}$ which converges to U and a subsequence
which converges to V.

d) If U = V, every subsequence of $\{a_n\}$ converges to U.

8.2 Given two real-valued sequences $\{a_n\}$ and $\{b_n\}$ bounded below. Prove that

a) $\limsup_{n \to \infty} (a_n + b_n) \le \limsup_{n \to \infty} a_n + \limsup_{n \to \infty} b_n$.

b) $\limsup_{n \to \infty} (a_n b_n) \le (\limsup_{n \to \infty} a_n)(\limsup_{n \to \infty} b_n)$ if $a_n > 0$, $b_n > 0$ for all n,
and if both $\limsup_{n \to \infty} a_n$ and $\limsup_{n \to \infty} b_n$ are finite or both are infinite.

8.3 Prove Theorems 8.3 and 8.4. 

8.4 If each $a_n > 0$, prove that

$$\liminf_{n \to \infty} \frac{a_{n+1}}{a_n} \le \liminf_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \frac{a_{n+1}}{a_n}.$$

8.5 Let $a_n = n^n/n!$. Show that $\lim_{n \to \infty} a_{n+1}/a_n = e$ and use Exercise 8.4 to deduce that

$$\lim_{n \to \infty} \frac{n}{(n!)^{1/n}} = e.$$

8.6 Let $\{a_n\}$ be a real-valued sequence and let $\sigma_n = (a_1 + \cdots + a_n)/n$. Show that

$$\liminf_{n \to \infty} a_n \le \liminf_{n \to \infty} \sigma_n \le \limsup_{n \to \infty} \sigma_n \le \limsup_{n \to \infty} a_n.$$

8.7 Find $\limsup_{n \to \infty} a_n$ and $\liminf_{n \to \infty} a_n$ if $a_n$ is given by

a) $\cos n$,
b) $\left(1 + \frac{1}{n}\right) \cos n\pi$,
c) $n \sin \frac{n\pi}{3}$,
d) $\sin \frac{n\pi}{2} \cos \frac{n\pi}{2}$,
e) $(-1)^n n/(1 + n)^n$,
f) $\frac{n}{3} - \left[\frac{n}{3}\right]$.

NOTE. In (f), [x] denotes the greatest integer $\le x$.

8.8 Let $a_n = 2\sqrt{n} - \sum_{k=1}^{n} 1/\sqrt{k}$. Prove that the sequence $\{a_n\}$ converges to a limit p
in the interval 1 < p < 2.


In each of Exercises 8.9 through 8.14, show that the real-valued sequence $\{a_n\}$ is
convergent. The given conditions are assumed to hold for all $n \ge 1$. In Exercises 8.10
through 8.14, show that $\{a_n\}$ has the limit L indicated.

8.9 $|a_n| < 2$, $\quad |a_{n+2} - a_{n+1}| \le \frac{1}{8} |a_{n+1}^2 - a_n^2|$.

8.10 $a_1 \ge 0$, $a_2 \ge 0$, $\quad a_{n+2} = (a_n a_{n+1})^{1/2}$, $\quad L = (a_1 a_2^2)^{1/3}$.

8.11 $a_1 = 2$, $a_2 = 8$, $\quad a_{2n+1} = \frac{1}{2}(a_{2n} + a_{2n-1})$, $\quad a_{2n+2} = \dfrac{a_{2n} a_{2n-1}}{a_{2n+1}}$, $\quad L = 4$.

8.12 $a_1 = -\frac{3}{2}$, $\quad 3a_{n+1} = 2 + a_n^3$, $\quad L = 1$.

8.13 $a_1 = 3$, $\quad a_{n+1} = \dfrac{3(1 + a_n)}{3 + a_n}$, $\quad L = \sqrt{3}$.

8.14 $a_n = \dfrac{b_{n+1}}{b_n}$, where $b_1 = b_2 = 1$, $b_{n+2} = b_n + b_{n+1}$, $\quad L = \dfrac{1 + \sqrt{5}}{2}$.

Hint. Show that $b_{n+2} b_n - b_{n+1}^2 = (-1)^{n+1}$ and deduce that $|a_n - a_{n+1}| < n^{-2}$, if
n > 4.

Series


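8.15 Test for convergence (p and q denote fixed real numbers).

a) $\sum_{n=1}^{\infty} n^3 e^{-n}$,
b) $\sum_{n=2}^{\infty} (\log n)^p$,
c) $\sum_{n=1}^{\infty} p^n n^p \quad (p > 0)$,
d) $\sum_{n=2}^{\infty} \dfrac{1}{n^p - n^q} \quad (0 < q < p)$,
e) $\sum_{n=1}^{\infty} n^{-1-1/n}$,
f) $\sum_{n=1}^{\infty} \dfrac{1}{p^n - q^n} \quad (0 < q < p)$,
g) $\sum_{n=1}^{\infty} \dfrac{1}{n \log (1 + 1/n)}$,
h) $\sum_{n=2}^{\infty} \dfrac{1}{(\log n)^{\log n}}$,
i) $\sum_{n=3}^{\infty} \dfrac{1}{n \log n \, (\log \log n)^p}$,
j) $\sum_{n=3}^{\infty} \left( \dfrac{1}{\log \log n} \right)^{\log \log n}$,
k) $\sum_{n=1}^{\infty} \left( \sqrt{1 + n^2} - n \right)$,
l) $\sum_{n=2}^{\infty} n^p \left( \dfrac{1}{\sqrt{n - 1}} - \dfrac{1}{\sqrt{n}} \right)$,
m) $\sum_{n=1}^{\infty} \left( \sqrt[n]{n} - 1 \right)^n$,
n) $\sum_{n=1}^{\infty} n^p \left( \sqrt{n + 1} - 2\sqrt{n} + \sqrt{n - 1} \right)$.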

8.16 Let $S = \{n_1, n_2, \dots\}$ denote the collection of those positive integers that do not
involve the digit 0 in their decimal representation. (For example, $7 \in S$ but $101 \notin S$.)
Show that $\sum_{k=1}^{\infty} 1/n_k$ converges and has a sum less than 90.

8.17 Given integers $a_1, a_2, \dots$ such that $1 \le a_n \le n - 1$, n = 2, 3, ... Show that the
sum of the series $\sum_{n=2}^{\infty} a_n/n!$ is rational if, and only if, there exists an integer N such that
$a_n = n - 1$ for all $n \ge N$. Hint. For sufficiency, show that $\sum_{n=2}^{\infty} (n - 1)/n!$ is a
telescoping series with sum 1.

8.18 Let p and q be fixed integers, $p \ge q \ge 1$, and let

$$x_n = \sum_{k=qn+1}^{pn} \frac{1}{k}, \qquad s_n = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k}.$$

a) Use formula (8) to prove that $\lim_{n \to \infty} x_n = \log (p/q)$.

b) When q = 1, p = 2, show that $s_{2n} = x_n$ and deduce that

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \log 2.$$

c) Rearrange the series in (b), writing alternately p positive terms followed by q
negative terms and use (a) to show that this rearrangement has sum

$$\log 2 + \tfrac{1}{2} \log (p/q).$$

d) Find the sum of $\sum_{n=1}^{\infty} (-1)^{n+1} \left( 1/(3n - 2) - 1/(3n - 1) \right)$.

8.19 Let $c_n = a_n + i b_n$, where $a_n = (-1)^n/\sqrt{n}$, $b_n = 1/n^2$. Show that $\sum c_n$ is conditionally
convergent.

8.20 Use Theorem 8.23 to derive the following formulas:

a) $\sum_{k=1}^{n} \dfrac{\log k}{k} = \dfrac{1}{2} \log^2 n + A + O\!\left( \dfrac{\log n}{n} \right)$ (A is constant).

b) $\sum_{k=2}^{n} \dfrac{1}{k \log k} = \log (\log n) + B + O\!\left( \dfrac{1}{n \log n} \right)$ (B is constant).

8.21 If $0 < a \le 1$, s > 1, define $\zeta(s, a) = \sum_{n=0}^{\infty} (n + a)^{-s}$.

a) Show that this series converges absolutely for s > 1 and prove that

$$\sum_{h=1}^{k} \zeta\!\left( s, \frac{h}{k} \right) = k^s \zeta(s) \quad \text{if } k = 1, 2, \dots,$$

where $\zeta(s) = \zeta(s, 1)$ is the Riemann zeta function.

b) Prove that $\sum_{n=1}^{\infty} (-1)^{n-1}/n^s = (1 - 2^{1-s}) \zeta(s)$ if s > 1.

8.22 Given a convergent series $\sum a_n$, where each $a_n \ge 0$. Prove that $\sum \sqrt{a_n} \, n^{-p}$ converges
if $p > \frac{1}{2}$. Give a counterexample for $p = \frac{1}{2}$.

8.23 Given that $\sum a_n$ diverges. Prove that $\sum n a_n$ also diverges.

8.24 Given that $\sum a_n$ converges, where each $a_n > 0$. Prove that

$$\sum (a_n a_{n+1})^{1/2}$$

also converges. Show that the converse is also true if $\{a_n\}$ is monotonic.

8.25 Given that $\sum a_n$ converges absolutely. Show that each of the following series also
converges absolutely:

a) $\sum a_n^2$,
b) $\sum \dfrac{a_n}{1 + a_n}$ (if no $a_n = -1$),
c) $\sum \dfrac{a_n^2}{1 + a_n^2}$.

8.26 Determine all real values of x for which the following series converges:

$$\sum_{n=1}^{\infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} \right) \frac{\sin nx}{n}.$$

8.27 Prove the following statements:

a) $\sum a_n b_n$ converges if $\sum a_n$ converges and if $\sum (b_n - b_{n+1})$ converges absolutely.

b) $\sum a_n b_n$ converges if $\sum a_n$ has bounded partial sums and if $\sum (b_n - b_{n+1})$ converges
absolutely, provided that $b_n \to 0$ as $n \to \infty$.


Double sequences and double series 

8.28 Investigate the existence of the two iterated limits and the double limit of the double
sequence f defined by

a) $f(p, q) = \dfrac{1}{p + q}$,
b) $f(p, q) = \dfrac{p}{p + q}$,
c) $f(p, q) = \dfrac{(-1)^p p}{p + q}$,
d) $f(p, q) = (-1)^{p+q} \left( \dfrac{1}{p} + \dfrac{1}{q} \right)$,
e) $f(p, q) = \dfrac{(-1)^p}{q}$,
f) $f(p, q) = (-1)^{p+q}$,
g) $f(p, q) = \dfrac{\cos p}{q}$,
h) $f(p, q) = \dfrac{p}{q^2} \sum_{n=1}^{q} \sin \dfrac{n}{p}$.

Answer. Double limit exists in (a), (d), (e), (g). Both iterated limits exist in (a), (b), (h).
Only one iterated limit exists in (c), (e). Neither iterated limit exists in (d), (f).

8.29 Prove the following statements : 

a) A double series of positive terms converges if, and only if, the set of partial sums 
is bounded. 

b) A double series converges if it converges absolutely. 

c) $\sum_{m,n} e^{-(m^2 + n^2)}$ converges.

8.30 Assume that the double series $\sum_{m,n} a(n) x^{mn}$ converges absolutely for |x| < 1. Call
its sum S(x). Show that each of the following series also converges absolutely for |x| < 1
and has sum S(x):

$$\sum_{n=1}^{\infty} a(n) \frac{x^n}{1 - x^n}, \qquad \sum_{n=1}^{\infty} A(n) x^n, \quad \text{where } A(n) = \sum_{d \mid n} a(d).$$

8.31 If $\alpha$ is real, show that the double series $\sum_{m,n} (m + in)^{-\alpha}$ converges absolutely if,
and only if, $\alpha > 2$. Hint. Let $s(p, q) = \sum_{m=1}^{p} \sum_{n=1}^{q} |m + in|^{-\alpha}$. The set

$$\{m + in : m = 1, 2, \dots, p, \ n = 1, 2, \dots, p\}$$

consists of $p^2$ complex numbers of which one has absolute value $\sqrt{2}$, three satisfy
$|1 + 2i| \le |m + in| \le 2\sqrt{2}$, five satisfy $|1 + 3i| \le |m + in| \le 3\sqrt{2}$, etc. Verify this
geometrically and deduce the inequality

$$2^{-\alpha/2} \sum_{n=1}^{p} \frac{2n - 1}{n^{\alpha}} \le s(p, p) \le \sum_{n=1}^{p} \frac{2n - 1}{(n^2 + 1)^{\alpha/2}}.$$

8.32 a) Show that the Cauchy product of $\sum_{n=0}^{\infty} (-1)^{n+1}/\sqrt{n + 1}$ with itself is a divergent
series.

b) Show that the Cauchy product of $\sum_{n=0}^{\infty} (-1)^{n+1}/(n + 1)$ with itself is the series

$$2 \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n + 1} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} \right).$$

Does this converge? Why?

8.33 Given two absolutely convergent power series, say $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$, having
sums A(x) and B(x), respectively, show that $\sum_{n=0}^{\infty} c_n x^n = A(x)B(x)$ where

$$c_n = \sum_{k=0}^{n} a_k b_{n-k}.$$

8.34 A series of the form $\sum_{n=1}^{\infty} a_n/n^s$ is called a Dirichlet series. Given two absolutely
convergent Dirichlet series, say $\sum_{n=1}^{\infty} a_n/n^s$ and $\sum_{n=1}^{\infty} b_n/n^s$, having sums A(s) and B(s),
respectively, show that $\sum_{n=1}^{\infty} c_n/n^s = A(s)B(s)$ where $c_n = \sum_{d \mid n} a_d b_{n/d}$.

8.35 If $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$, s > 1, show that $\zeta^2(s) = \sum_{n=1}^{\infty} d(n)/n^s$, where d(n) is the
number of positive divisors of n (including 1 and n).


Cesaro summability 


8.36 Show that each of the following series has (C, 1) sum 0: 

a) $1 - 1 - 1 + 1 + 1 - 1 - 1 + 1 + 1 - - + + \cdots$.

b) $\frac{1}{2} - 1 + \frac{1}{2} + \frac{1}{2} - 1 + \frac{1}{2} + \frac{1}{2} - 1 + + - \cdots$.

c) $\cos x + \cos 3x + \cos 5x + \cdots$ (x real, $x \ne m\pi$).


8.37 Given a series $\sum a_n$, let

$$s_n = \sum_{k=1}^{n} a_k, \qquad t_n = \sum_{k=1}^{n} k a_k, \qquad \sigma_n = \frac{1}{n} \sum_{k=1}^{n} s_k.$$

Prove that

a) $t_n = (n + 1) s_n - n \sigma_n$.

b) If $\sum a_n$ is (C, 1) summable, then $\sum a_n$ converges if, and only if, $t_n = o(n)$ as $n \to \infty$.

c) $\sum a_n$ is (C, 1) summable if, and only if, $\sum_{n=1}^{\infty} t_n/(n(n + 1))$ converges.

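8.38 Given a monotonic sequence $\{a_n\}$ of positive terms, such that $\lim_{n \to \infty} a_n = 0$. Let

$$s_n = \sum_{k=1}^{n} a_k, \qquad u_n = \sum_{k=1}^{n} (-1)^k a_k, \qquad v_n = \sum_{k=1}^{n} (-1)^k s_k.$$

Prove that:

a) $v_n = \frac{1}{2} u_n + (-1)^n s_n/2$.

b) $\sum_{n=1}^{\infty} (-1)^n s_n$ is (C, 1) summable and has Cesàro sum $\frac{1}{2} \sum_{n=1}^{\infty} (-1)^n a_n$.

c) $\sum_{n=1}^{\infty} (-1)^n \left( 1 + \frac{1}{2} + \cdots + 1/n \right) = -\log \sqrt{2}$ (C, 1).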


Infinite products 

8.39 Determine whether or not the following infinite products converge. Find the value
of each convergent product.

a) $\prod_{n=2}^{\infty} \left( 1 - \dfrac{2}{n(n + 1)} \right)$,
b) $\prod_{n=2}^{\infty} (1 - n^{-2})$,
c) $\prod_{n=2}^{\infty} \dfrac{n^3 - 1}{n^3 + 1}$,
d) $\prod_{n=0}^{\infty} (1 + z^{2^n})$ if |z| < 1.

8.40 If each partial sum $s_n$ of the convergent series $\sum a_n$ is not zero and if the sum itself
is not zero, show that the infinite product $a_1 \prod_{n=2}^{\infty} (1 + a_n/s_{n-1})$ converges and has the
value $\sum_{n=1}^{\infty} a_n$.

8.41 Find the values of the following products by establishing the following identities and
summing the series:

a) $\prod_{n=2}^{\infty} \left( 1 + \dfrac{1}{2^n - 2} \right) = 2 \sum_{n=1}^{\infty} 2^{-n}$,
b) $\prod_{n=2}^{\infty} \left( 1 + \dfrac{1}{n^2 - 1} \right) = 2 \sum_{n=1}^{\infty} \dfrac{1}{n(n + 1)}$.

8.42 Determine all real x for which the product $\prod_{n=1}^{\infty} \cos (x/2^n)$ converges and find the
value of the product when it does converge.

8.43 a) Let $a_n = (-1)^n/\sqrt{n}$ for n = 1, 2, ... Show that $\prod (1 + a_n)$ diverges but that
$\sum a_n$ converges.

b) Let $a_{2n-1} = -1/\sqrt{n}$, $a_{2n} = 1/\sqrt{n} + 1/n$ for n = 1, 2, ... Show that $\prod (1 + a_n)$
converges but that $\sum a_n$ diverges.


8.44 Assume that $a_n \ge 0$ for each n = 1, 2, ... Assume further that

$$a_{2n+2} < a_{2n+1} < \frac{a_{2n}}{1 + a_{2n}} \quad \text{for } n = 1, 2, \dots$$

Show that $\prod_{k=1}^{\infty} (1 + (-1)^k a_k)$ converges if, and only if, $\sum_{k=1}^{\infty} (-1)^k a_k$ converges.

8.45 A complex-valued sequence $\{f(n)\}$ is called multiplicative if f(1) = 1 and if f(mn) =
f(m)f(n) whenever m and n are relatively prime. (See Section 1.7.) It is called
completely multiplicative if

$$f(1) = 1 \quad \text{and} \quad f(mn) = f(m)f(n) \quad \text{for all } m \text{ and } n.$$

a) If $\{f(n)\}$ is multiplicative and if the series $\sum f(n)$ converges absolutely, prove that

$$\sum_{n=1}^{\infty} f(n) = \prod_{k=1}^{\infty} \{1 + f(p_k) + f(p_k^2) + \cdots\},$$

where $p_k$ denotes the kth prime, the product being absolutely convergent.

b) If, in addition, $\{f(n)\}$ is completely multiplicative, prove that the formula in (a)
becomes

$$\sum_{n=1}^{\infty} f(n) = \prod_{k=1}^{\infty} \frac{1}{1 - f(p_k)}.$$

Note that Euler's product for $\zeta(s)$ (Theorem 8.56) is the special case in which
$f(n) = n^{-s}$.

8.46 This exercise outlines a simple proof of the formula $\zeta(2) = \pi^2/6$. Start with the
inequality $\sin x < x < \tan x$, valid for $0 < x < \pi/2$, take reciprocals, and square each
member to obtain

$$\cot^2 x < \frac{1}{x^2} < 1 + \cot^2 x.$$

Now put $x = k\pi/(2m + 1)$, where k and m are integers, with $1 \le k \le m$, and sum on k
to obtain

$$\sum_{k=1}^{m} \cot^2 \frac{k\pi}{2m + 1} < \frac{(2m + 1)^2}{\pi^2} \sum_{k=1}^{m} \frac{1}{k^2} < m + \sum_{k=1}^{m} \cot^2 \frac{k\pi}{2m + 1}.$$

Use the formula of Exercise 1.49(c) to deduce the inequality

$$\frac{m(2m - 1)\pi^2}{3(2m + 1)^2} < \sum_{k=1}^{m} \frac{1}{k^2} < \frac{2m(m + 1)\pi^2}{3(2m + 1)^2}.$$

Now let $m \to \infty$ to obtain $\zeta(2) = \pi^2/6$.

8.47 Use an argument similar to that outlined in Exercise 8.46 to prove that $\zeta(4) = \pi^4/90$.


CHAPTER 9


SEQUENCES 
OF FUNCTIONS 


9.1 POINTWISE CONVERGENCE OF SEQUENCES OF FUNCTIONS 

This chapter deals with sequences $\{f_n\}$ whose terms are real- or complex-valued
functions having a common domain on the real line R or in the complex plane C.
For each x in the domain we can form another sequence $\{f_n(x)\}$ whose terms are
the corresponding function values. Let S denote the set of x for which this second
sequence converges. The function f defined by the equation

$$f(x) = \lim_{n \to \infty} f_n(x), \quad \text{if } x \in S,$$

is called the limit function of the sequence $\{f_n\}$, and we say that $\{f_n\}$ converges
pointwise to f on the set S.

Our chief interest in this chapter is the following type of question: If each
function of a sequence $\{f_n\}$ has a certain property, such as continuity,
differentiability, or integrability, to what extent is this property transferred to the limit
function? For example, if each function $f_n$ is continuous at c, is the limit function
f also continuous at c? We shall see that, in general, it is not. In fact, we shall
find that pointwise convergence is usually not strong enough to transfer any of the
properties mentioned above from the individual terms $f_n$ to the limit function f.
Therefore we are led to study stronger methods of convergence that do preserve
these properties. The most important of these is the notion of uniform convergence.

Before we introduce uniform convergence, let us formulate one of our basic
questions in another way. When we ask whether continuity of each $f_n$ at c implies
continuity of the limit function f at c, we are really asking whether the equation

$$\lim_{x \to c} f_n(x) = f_n(c),$$

implies the equation

$$\lim_{x \to c} f(x) = f(c). \tag{1}$$

But (1) can also be written as follows:

$$\lim_{x \to c} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to c} f_n(x). \tag{2}$$

Therefore our question about continuity amounts to this: Can we interchange the
limit symbols in (2)? We shall see that, in general, we cannot. First of all, the
limit in (1) may not exist. Secondly, even if it does exist, it need not be equal to
f(c). We encountered a similar situation in Chapter 8 in connection with iterated
series when we found that $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} f(m, n)$ is not necessarily equal to
$\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} f(m, n)$.

The general question of whether we can reverse the order of two limit
processes arises again and again in mathematical analysis. We shall find that uniform
convergence is a far-reaching sufficient condition for the validity of interchanging
certain limits, but it does not provide the complete answer to the question. We
shall encounter examples in which the order of two limits can be interchanged
although the sequence is not uniformly convergent.


9.2 EXAMPLES OF SEQUENCES OF REAL-VALUED FUNCTIONS 

The following examples illustrate some of the possibilities that might arise when 
we form the limit function of a sequence of real-valued functions. 



/ (x) = lim f n (x) . 

n — >oc 


Figure 9.1 


Example 1. A sequence of continuous functions with a discontinuous limit function. Let
$f_n(x) = x^{2n}/(1 + x^{2n})$ if $x \in \mathbf{R}$, $n = 1, 2, \ldots$ The graphs of a few
terms are shown in Fig. 9.1. In this case $\lim_{n \to \infty} f_n(x)$ exists for every real
$x$, and the limit function $f$ is given by

$$ f(x) = \begin{cases} 0 & \text{if } |x| < 1, \\ \tfrac{1}{2} & \text{if } |x| = 1, \\ 1 & \text{if } |x| > 1. \end{cases} $$

Each $f_n$ is continuous on $\mathbf{R}$, but $f$ is discontinuous at $x = 1$ and $x = -1$.
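The behavior in Example 1 can be observed numerically. The following is a minimal
sketch in Python; the function names `f` and `limit_f` are illustrative choices, not
notation from the text, and floating-point evaluation only suggests the limit, it
does not prove it.

```python
def f(n, x):
    """n-th term of Example 1: x^(2n) / (1 + x^(2n))."""
    p = x ** (2 * n)
    return p / (1.0 + p)


def limit_f(x):
    """Pointwise limit described in Example 1."""
    if abs(x) < 1:
        return 0.0
    if abs(x) == 1:
        return 0.5
    return 1.0


if __name__ == "__main__":
    # Values near x = 1 show the jump forming as n grows.
    for x in (0.9, 1.0, 1.1):
        print(x, [f(n, x) for n in (1, 10, 100)], limit_f(x))
```

Each `f(n, .)` is continuous, yet the printed values on either side of $x = 1$ drift
toward 0 and 1 respectively, which is the discontinuity of the limit function.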


Example 2. A sequence of functions for which
$\lim_{n \to \infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n \to \infty} f_n(x)\,dx$. Let
$f_n(x) = n^2 x(1 - x)^n$ if $x \in \mathbf{R}$, $n = 1, 2, \ldots$ If $0 \le x \le 1$ the limit
$f(x) = \lim_{n \to \infty} f_n(x)$ exists and equals 0. (See Fig. 9.2.) Hence
$\int_0^1 f(x)\,dx = 0$. But

$$ \int_0^1 f_n(x)\,dx = n^2 \int_0^1 x(1 - x)^n\,dx = n^2 \int_0^1 (1 - t)t^n\,dt
   = \frac{n^2}{n + 1} - \frac{n^2}{n + 2} = \frac{n^2}{(n + 1)(n + 2)}, $$

so $\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = 1$. In other words, the limit of the integrals
is not equal to the integral of the limit function. Therefore the operations of “limit”
and “integration” cannot always be interchanged.
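The closed form $n^2/((n+1)(n+2))$ from Example 2 can be cross-checked against a
numerical quadrature. The sketch below is only an illustration; the helper names and
the choice of the midpoint rule are assumptions made here, not part of the text.

```python
from fractions import Fraction


def integral_fn(n):
    """Exact integral of n^2 x (1 - x)^n over [0, 1], from the closed form."""
    return Fraction(n * n, (n + 1) * (n + 2))


def midpoint_integral(n, steps=20000):
    """Midpoint-rule approximation of the same integral (numerical check)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += n * n * x * (1.0 - x) ** n
    return total * h


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, float(integral_fn(n)), midpoint_integral(n))
```

The exact values increase toward 1 while the pointwise limit integrates to 0.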

Example 3. A sequence of differentiable functions $\{f_n\}$ with limit 0 for which $\{f_n'\}$
diverges. Let $f_n(x) = (\sin nx)/\sqrt{n}$ if $x \in \mathbf{R}$, $n = 1, 2, \ldots$ Then
$\lim_{n \to \infty} f_n(x) = 0$ for every $x$. But $f_n'(x) = \sqrt{n} \cos nx$, so
$\lim_{n \to \infty} f_n'(x)$ does not exist for any $x$. (See Fig. 9.3.)



9.3 DEFINITION OF UNIFORM CONVERGENCE 

Let $\{f_n\}$ be a sequence of functions which converges pointwise on a set $S$ to a
limit function $f$. This means that for each point $x$ in $S$ and for each
$\varepsilon > 0$, there exists an $N$ (depending on both $x$ and $\varepsilon$) such that

$$ n > N \quad \text{implies} \quad |f_n(x) - f(x)| < \varepsilon. $$

If the same $N$ works equally well for every point in $S$, the convergence is said to be
uniform on $S$. That is, we have

Definition 9.1. A sequence of functions $\{f_n\}$ is said to converge uniformly to $f$ on a
set $S$ if, for every $\varepsilon > 0$, there exists an $N$ (depending only on
$\varepsilon$) such that $n > N$ implies

$$ |f_n(x) - f(x)| < \varepsilon, \quad \text{for every } x \text{ in } S. $$

We denote this symbolically by writing

$$ f_n \to f \quad \text{uniformly on } S. $$

When each term of the sequence $\{f_n\}$ is real-valued, there is a useful geometric
interpretation of uniform convergence. The inequality $|f_n(x) - f(x)| < \varepsilon$ is
then equivalent to the two inequalities

$$ f(x) - \varepsilon < f_n(x) < f(x) + \varepsilon. \tag{3} $$

If (3) is to hold for all $n > N$ and for all $x$ in $S$, this means that the entire graph
of $f_n$ (that is, the set $\{(x, y) : y = f_n(x),\ x \in S\}$) lies within a “band” of
height $2\varepsilon$ situated symmetrically about the graph of $f$. (See Fig. 9.4.)

[Figure 9.4: the graph of $f_n$ lying within the band between $f - \varepsilon$ and $f + \varepsilon$.]


A sequence $\{f_n\}$ is said to be uniformly bounded on $S$ if there exists a constant
$M > 0$ such that $|f_n(x)| \le M$ for all $x$ in $S$ and all $n$. The number $M$ is called a
uniform bound for $\{f_n\}$. If each individual function is bounded and if $f_n \to f$
uniformly on $S$, then it is easy to prove that $\{f_n\}$ is uniformly bounded on $S$. (See
Exercise 9.1.) This observation often enables us to conclude that a sequence is
not uniformly convergent. For instance, a glance at Fig. 9.2 tells us at once that
the sequence of Example 2 cannot converge uniformly on any subset containing a
neighborhood of the origin. However, the convergence in this example is uniform
on every compact subinterval not containing the origin.
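The remark about Example 2 can be made concrete by sampling
$\sup |f_n(x) - f(x)|$ with $f = 0$. The sketch below is an illustration under stated
assumptions: `grid_sup` is a hypothetical helper, and a maximum over a finite grid
is only a lower bound for the true supremum.

```python
def f(n, x):
    """n-th term of Example 2: n^2 x (1 - x)^n."""
    return n * n * x * (1.0 - x) ** n


def grid_sup(n, lo, hi, points=10001):
    """Largest |f_n(x)| over an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / (points - 1)
    return max(abs(f(n, lo + i * step)) for i in range(points))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Near the origin the peak grows with n; away from it the values collapse.
        print(n, grid_sup(n, 0.0, 1.0), grid_sup(n, 0.5, 1.0))
```

On $[0, 1]$ the sampled maximum grows with $n$, so the sequence is not uniformly
bounded there, while on $[0.5, 1]$ it shrinks rapidly, in line with uniform
convergence on compact subintervals that avoid the origin.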


9.4 UNIFORM CONVERGENCE AND CONTINUITY 

Theorem 9.2. Assume that $f_n \to f$ uniformly on $S$. If each $f_n$ is continuous at a
point $c$ of $S$, then the limit function $f$ is also continuous at $c$.

NOTE. If $c$ is an accumulation point of $S$, the conclusion implies that

$$ \lim_{x \to c} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to c} f_n(x). $$

Proof. If $c$ is an isolated point of $S$, then $f$ is automatically continuous at $c$.
Suppose, then, that $c$ is an accumulation point of $S$. By hypothesis, for every
$\varepsilon > 0$ there is an $M$ such that $n \ge M$ implies

$$ |f_n(x) - f(x)| < \frac{\varepsilon}{3} \quad \text{for every } x \text{ in } S. $$

Since $f_M$ is continuous at $c$, there is a neighborhood $B(c)$ such that
$x \in B(c) \cap S$ implies

$$ |f_M(x) - f_M(c)| < \frac{\varepsilon}{3}. $$

But

$$ |f(x) - f(c)| \le |f(x) - f_M(x)| + |f_M(x) - f_M(c)| + |f_M(c) - f(c)|. $$

If $x \in B(c) \cap S$, each term on the right is less than $\varepsilon/3$ and hence
$|f(x) - f(c)| < \varepsilon$. This proves the theorem.

NOTE. Uniform convergence of $\{f_n\}$ is sufficient but not necessary to transmit
continuity from the individual terms to the limit function. In Example 2 (Section
9.2), we have a nonuniformly convergent sequence of continuous functions with
a continuous limit function.


9.5 THE CAUCHY CONDITION FOR UNIFORM CONVERGENCE 

Theorem 9.3. Let $\{f_n\}$ be a sequence of functions defined on a set $S$. There exists a
function $f$ such that $f_n \to f$ uniformly on $S$ if, and only if, the following condition
(called the Cauchy condition) is satisfied: For every $\varepsilon > 0$ there exists an $N$
such that $m > N$ and $n > N$ implies

$$ |f_m(x) - f_n(x)| < \varepsilon, \quad \text{for every } x \text{ in } S. $$

Proof. Assume that $f_n \to f$ uniformly on $S$. Then, given $\varepsilon > 0$, we can find
$N$ so that $n > N$ implies $|f_n(x) - f(x)| < \varepsilon/2$ for all $x$ in $S$. Taking
$m > N$, we also have $|f_m(x) - f(x)| < \varepsilon/2$, and hence
$|f_m(x) - f_n(x)| < \varepsilon$ for every $x$ in $S$.

Conversely, suppose the Cauchy condition is satisfied. Then, for each $x$ in $S$,
the sequence $\{f_n(x)\}$ converges. Let $f(x) = \lim_{n \to \infty} f_n(x)$ if $x \in S$. We
must show that $f_n \to f$ uniformly on $S$. If $\varepsilon > 0$ is given, we can choose $N$
so that $n > N$ implies $|f_n(x) - f_{n+k}(x)| < \varepsilon/2$ for every $k = 1, 2, \ldots$,
and every $x$ in $S$. Therefore,
$\lim_{k \to \infty} |f_n(x) - f_{n+k}(x)| = |f_n(x) - f(x)| \le \varepsilon/2$. Hence,
$n > N$ implies $|f_n(x) - f(x)| < \varepsilon$ for every $x$ in $S$. This proves that
$f_n \to f$ uniformly on $S$.

NOTE. Pointwise and uniform convergence can be formulated in the more general
setting of metric spaces. If $f_n$ and $f$ are functions from a nonempty set $S$ to a metric
space $(T, d_T)$, we say that $f_n \to f$ uniformly on $S$ if, for every $\varepsilon > 0$,
there is an $N$ (depending only on $\varepsilon$) such that $n \ge N$ implies

$$ d_T(f_n(x), f(x)) < \varepsilon \quad \text{for all } x \text{ in } S. $$

Theorem 9.3 is valid in this more general setting and, if $S$ is a metric space, Theorem
9.2 is also valid. The same proofs go through, with the appropriate replacement
of the Euclidean metric by the metrics $d_S$ and $d_T$. Since we are primarily interested
in real- or complex-valued functions defined on subsets of $\mathbf{R}$ or of $\mathbf{C}$, we
will not pursue this extension any further except to mention the following example.

Example. Consider the metric space $(B(S), d)$ of all bounded real-valued functions on a
nonempty set $S$, with metric $d(f, g) = \|f - g\|$, where $\|f\| = \sup_{x \in S} |f(x)|$ is
the sup norm. (See Exercise 4.66.) Then $f_n \to f$ in the metric space $(B(S), d)$ if and
only if $f_n \to f$ uniformly on $S$. In other words, uniform convergence on $S$ is the same
as ordinary convergence in the metric space $(B(S), d)$.


9.6 UNIFORM CONVERGENCE OF INFINITE SERIES OF FUNCTIONS 

Definition 9.4. Given a sequence $\{f_n\}$ of functions defined on a set $S$. For each $x$ in
$S$, let

$$ s_n(x) = \sum_{k=1}^{n} f_k(x) \qquad (n = 1, 2, \ldots). \tag{4} $$

If there exists a function $f$ such that $s_n \to f$ uniformly on $S$, we say the series
$\sum f_n(x)$ converges uniformly on $S$ and we write

$$ \sum_{n=1}^{\infty} f_n(x) = f(x) \quad \text{(uniformly on } S\text{)}. $$


Theorem 9.5 (Cauchy condition for uniform convergence of series). The infinite series
$\sum f_n(x)$ converges uniformly on $S$ if and only if for every $\varepsilon > 0$ there is an
$N$ such that $n > N$ implies

$$ \left| \sum_{k=n+1}^{n+p} f_k(x) \right| < \varepsilon, $$

for each $p = 1, 2, \ldots$, and every $x$ in $S$.

Proof. Define $s_n$ by (4) and apply Theorem 9.3.


Theorem 9.6 (Weierstrass M-test). Let $\{M_n\}$ be a sequence of nonnegative numbers
such that

$$ 0 \le |f_n(x)| \le M_n, \quad \text{for } n = 1, 2, \ldots, \text{ and for every } x \text{ in } S. $$

Then $\sum f_n(x)$ converges uniformly on $S$ if $\sum M_n$ converges.

Proof. Apply Theorems 8.11 and 9.5 in conjunction with the inequality

$$ \left| \sum_{k=n+1}^{n+p} f_k(x) \right| \le \sum_{k=n+1}^{n+p} M_k. $$
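The inequality used in the M-test can be checked on a concrete series. The sketch
below assumes the illustrative choice $f_k(x) = \sin(kx)/k^2$ with $M_k = 1/k^2$, which
does not appear in the text; the bound $\sum_{k>n} 1/k^2 \le 1/n$ follows from
comparison with $\int_n^\infty t^{-2}\,dt$.

```python
import math


def partial_sum(x, n):
    """s_n(x) for the illustrative series sum of sin(kx) / k^2."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))


def tail_bound(n):
    """Upper bound 1/n for the tail sum of M_k = 1/k^2 over k > n."""
    return 1.0 / n


def max_gap(n, p, points=200):
    """Largest |s_{n+p}(x) - s_n(x)| over a grid of x values in [0, 2*pi)."""
    xs = [2.0 * math.pi * i / points for i in range(points)]
    return max(abs(partial_sum(x, n + p) - partial_sum(x, n)) for x in xs)


if __name__ == "__main__":
    for n in (10, 50):
        print(n, max_gap(n, 100), tail_bound(n))
```

The gap between partial sums is controlled by a quantity that does not depend on
$x$, which is exactly what the Cauchy condition of Theorem 9.5 requires.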


Theorem 9.7. Assume that $\sum f_n(x) = f(x)$ (uniformly on $S$). If each $f_n$ is continuous
at a point $x_0$ of $S$, then $f$ is also continuous at $x_0$.

Proof. Define $s_n$ by (4). Continuity of each $f_n$ at $x_0$ implies continuity of $s_n$ at
$x_0$, and the conclusion follows at once from Theorem 9.2.

NOTE. If $x_0$ is an accumulation point of $S$, this theorem permits us to interchange
limits and infinite sums, as follows:

$$ \lim_{x \to x_0} \sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} \lim_{x \to x_0} f_n(x). $$


9.7 A SPACE-FILLING CURVE 

We can apply Theorem 9.7 to construct a space-filling curve. This is a continuous
curve in $\mathbf{R}^2$ that passes through every point of the unit square
$[0, 1] \times [0, 1]$. Peano (1890) was the first to give an example of such a curve. The
example to be presented here is due to I. J. Schoenberg (Bulletin of the American
Mathematical Society, 1938) and can be described as follows:

Let $\phi$ be defined on the interval $[0, 2]$ by the following formulas:

$$ \phi(t) = \begin{cases}
0, & \text{if } 0 \le t \le \tfrac{1}{3}, \text{ or if } \tfrac{5}{3} \le t \le 2, \\
3t - 1, & \text{if } \tfrac{1}{3} \le t \le \tfrac{2}{3}, \\
1, & \text{if } \tfrac{2}{3} \le t \le \tfrac{4}{3}, \\
-3t + 5, & \text{if } \tfrac{4}{3} \le t \le \tfrac{5}{3}.
\end{cases} $$

Extend the definition of $\phi$ to all of $\mathbf{R}$ by the equation

$$ \phi(t + 2) = \phi(t). $$

This makes $\phi$ periodic with period 2. (The graph of $\phi$ is shown in Fig. 9.5.)

[Figure 9.5: graph of $\phi$.]


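Now define two functions $f_1$ and $f_2$ by the following equations:

$$ f_1(t) = \sum_{n=1}^{\infty} \frac{\phi(3^{2n-2} t)}{2^n}, \qquad
   f_2(t) = \sum_{n=1}^{\infty} \frac{\phi(3^{2n-1} t)}{2^n}. $$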

Both series converge absolutely for each real $t$ and they converge uniformly on
$\mathbf{R}$. In fact, since $|\phi(t)| \le 1$ for all $t$, the Weierstrass M-test is applicable
with $M_n = 2^{-n}$. Since $\phi$ is continuous on $\mathbf{R}$, Theorem 9.7 tells us that $f_1$
and $f_2$ are also continuous on $\mathbf{R}$. Let $f = (f_1, f_2)$ and let $\Gamma$ denote the
image of the unit interval $[0, 1]$ under $f$. We will show that $\Gamma$ “fills” the unit
square, i.e., that $\Gamma = [0, 1] \times [0, 1]$.

First, it is clear that $0 \le f_1(t) \le 1$ and $0 \le f_2(t) \le 1$ for each $t$, since
$\sum_{n=1}^{\infty} 2^{-n} = 1$. Hence, $\Gamma$ is a subset of the unit square. Next, we must
show that $(a, b) \in \Gamma$ whenever $(a, b) \in [0, 1] \times [0, 1]$. For this purpose we
write $a$ and $b$ in the binary system. That is, we write

$$ a = \sum_{n=1}^{\infty} \frac{a_n}{2^n}, \qquad b = \sum_{n=1}^{\infty} \frac{b_n}{2^n}, $$

where each $a_n$ and each $b_n$ is either 0 or 1. (See Exercise 1.22.) Now let

$$ c = 2 \sum_{n=1}^{\infty} \frac{c_n}{3^n}, \quad \text{where } c_{2n-1} = a_n \text{ and } c_{2n} = b_n, \quad n = 1, 2, \ldots $$

Clearly, $0 \le c \le 1$ since $2 \sum_{n=1}^{\infty} 3^{-n} = 1$. We will show that $f_1(c) = a$
and that $f_2(c) = b$.

If we can prove that

$$ \phi(3^k c) = c_{k+1}, \quad \text{for each } k = 0, 1, 2, \ldots, \tag{5} $$

then we will have $\phi(3^{2n-2} c) = c_{2n-1} = a_n$ and $\phi(3^{2n-1} c) = c_{2n} = b_n$, and
this will give us $f_1(c) = a$, $f_2(c) = b$. To prove (5), we write

$$ 3^k c = 2 \sum_{n=1}^{k} \frac{c_n}{3^{n-k}} + 2 \sum_{n=k+1}^{\infty} \frac{c_n}{3^{n-k}}
   = (\text{an even integer}) + d_k, $$

where $d_k = 2 \sum_{n=1}^{\infty} c_{n+k}/3^n$. Since $\phi$ has period 2, it follows that

$$ \phi(3^k c) = \phi(d_k). $$

If $c_{k+1} = 0$, then we have $0 \le d_k \le 2 \sum_{n=2}^{\infty} 3^{-n} = \tfrac{1}{3}$, and
hence $\phi(d_k) = 0$. Therefore, $\phi(3^k c) = c_{k+1}$ in this case. The only other case to
consider is $c_{k+1} = 1$. But then we get $\tfrac{2}{3} \le d_k \le 1$ and hence
$\phi(d_k) = 1$. Therefore, $\phi(3^k c) = c_{k+1}$ in all cases and this proves that
$f_1(c) = a$, $f_2(c) = b$. Hence, $\Gamma$ fills the unit square.
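The construction of the preimage $c$ can be tested exactly when $a$ and $b$ have
finitely many binary digits. The following sketch uses rational arithmetic; the
names `phi`, `curve`, and `preimage` are hypothetical helpers introduced here, and
truncating the series to finitely many terms is an assumption that is harmless in
this case because the omitted terms vanish once $3^k c$ is an even integer.

```python
from fractions import Fraction


def phi(t):
    """Schoenberg's piecewise-linear function, extended with period 2."""
    t = t % 2
    if t <= Fraction(1, 3) or t >= Fraction(5, 3):
        return Fraction(0)
    if t <= Fraction(2, 3):
        return 3 * t - 1
    if t <= Fraction(4, 3):
        return Fraction(1)
    return -3 * t + 5


def curve(t, terms):
    """Partial sums of f_1 and f_2 with the given number of terms."""
    f1 = sum(phi(3 ** (2 * n - 2) * t) / 2**n for n in range(1, terms + 1))
    f2 = sum(phi(3 ** (2 * n - 1) * t) / 2**n for n in range(1, terms + 1))
    return f1, f2


def preimage(a_bits, b_bits):
    """Point c = 2 * sum c_n / 3^n with c_{2n-1} = a_n and c_{2n} = b_n."""
    digits = []
    for a_n, b_n in zip(a_bits, b_bits):
        digits += [a_n, b_n]
    return 2 * sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))


if __name__ == "__main__":
    c = preimage([1, 0, 1], [0, 1, 1])  # a = 0.101 and b = 0.011 in binary
    print(c, curve(c, 10))
```

For the digits shown, `curve(c, 10)` reproduces $a = 5/8$ and $b = 3/8$, matching the
argument based on equation (5).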


9.8 UNIFORM CONVERGENCE AND RIEMANN-STIELTJES INTEGRATION 


Theorem 9.8. Let $\alpha$ be of bounded variation on $[a, b]$. Assume that each term of
the sequence $\{f_n\}$ is a real-valued function such that $f_n \in R(\alpha)$ on $[a, b]$ for
each $n = 1, 2, \ldots$ Assume that $f_n \to f$ uniformly on $[a, b]$ and define
$g_n(x) = \int_a^x f_n(t)\,d\alpha(t)$ if $x \in [a, b]$, $n = 1, 2, \ldots$ Then we have:

a) $f \in R(\alpha)$ on $[a, b]$.

b) $g_n \to g$ uniformly on $[a, b]$, where $g(x) = \int_a^x f(t)\,d\alpha(t)$.

NOTE. The conclusion implies that, for each $x$ in $[a, b]$, we can write

$$ \lim_{n \to \infty} \int_a^x f_n(t)\,d\alpha(t) = \int_a^x \lim_{n \to \infty} f_n(t)\,d\alpha(t). $$

This property is often described by saying that a uniformly convergent sequence
can be integrated term by term.

Proof. We can assume that $\alpha$ is increasing with $\alpha(a) < \alpha(b)$. To prove (a), we
will show that $f$ satisfies Riemann’s condition with respect to $\alpha$ on $[a, b]$. (See
Theorem 7.19.)

Given $\varepsilon > 0$, choose $N$ so that

$$ |f(x) - f_N(x)| < \frac{\varepsilon}{3[\alpha(b) - \alpha(a)]}, \quad \text{for all } x \text{ in } [a, b]. $$

Then, for every partition $P$ of $[a, b]$, we have

$$ |U(P, f - f_N, \alpha)| \le \frac{\varepsilon}{3} \quad \text{and} \quad |L(P, f - f_N, \alpha)| \le \frac{\varepsilon}{3} $$

(using the notation of Definition 7.14). For this $N$, choose $P_\varepsilon$ so that $P$ finer
than $P_\varepsilon$ implies $U(P, f_N, \alpha) - L(P, f_N, \alpha) < \varepsilon/3$. Then for such
$P$ we have

$$ \begin{aligned}
U(P, f, \alpha) - L(P, f, \alpha) &\le U(P, f - f_N, \alpha) - L(P, f - f_N, \alpha) + U(P, f_N, \alpha) - L(P, f_N, \alpha) \\
&< |U(P, f - f_N, \alpha)| + |L(P, f - f_N, \alpha)| + \frac{\varepsilon}{3} \le \varepsilon.
\end{aligned} $$

This proves (a). To prove (b), let $\varepsilon > 0$ be given and choose $N$ so that

$$ |f_n(t) - f(t)| < \frac{\varepsilon}{2[\alpha(b) - \alpha(a)]}, $$

for all $n > N$ and every $t$ in $[a, b]$. If $x \in [a, b]$, we have

$$ |g_n(x) - g(x)| \le \int_a^x |f_n(t) - f(t)|\,d\alpha(t)
   \le \frac{\alpha(x) - \alpha(a)}{\alpha(b) - \alpha(a)} \cdot \frac{\varepsilon}{2}
   \le \frac{\varepsilon}{2} < \varepsilon. $$

This proves that $g_n \to g$ uniformly on $[a, b]$.

Theorem 9.9. Let $\alpha$ be of bounded variation on $[a, b]$ and assume that
$\sum f_n(x) = f(x)$ (uniformly on $[a, b]$), where each $f_n$ is a real-valued function such
that $f_n \in R(\alpha)$ on $[a, b]$. Then we have:

a) $f \in R(\alpha)$ on $[a, b]$.

b) $\int_a^x \sum_{n=1}^{\infty} f_n(t)\,d\alpha(t) = \sum_{n=1}^{\infty} \int_a^x f_n(t)\,d\alpha(t)$
   (uniformly on $[a, b]$).

Proof. Apply Theorem 9.8 to the sequence of partial sums.

NOTE. This theorem is described by saying that a uniformly convergent series
can be integrated term by term.
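As an illustration of term-by-term integration, consider the geometric series
$\sum_{n \ge 0} x^n = 1/(1 - x)$ on $[0, 1/2]$, which converges uniformly there by the
M-test with $M_n = 2^{-n}$. This example and the helper name below are choices made
for this sketch, with $\alpha(x) = x$; they are not taken from the text.

```python
import math


def termwise_integral(terms):
    """Sum of the integrals of x^n over [0, 1/2] for n = 0, ..., terms - 1."""
    return sum(0.5 ** (n + 1) / (n + 1) for n in range(terms))


if __name__ == "__main__":
    # The integral of 1 / (1 - x) over [0, 1/2] equals log 2.
    print(termwise_integral(60), math.log(2.0))
```

The summed integrals agree with $\int_0^{1/2} dx/(1 - x) = \log 2$ to within
rounding error.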


9.9 NONUNIFORMLY CONVERGENT SEQUENCES THAT CAN BE 
INTEGRATED TERM BY TERM 

Uniform convergence is a sufficient but not a necessary condition for term-by-term
integration, as is seen by the following example.

Example. Let $f_n(x) = x^n$ if $0 \le x \le 1$. (See Fig. 9.6.) The limit function $f$ has the
value 0 in $[0, 1)$ and $f(1) = 1$. Since this is a sequence of continuous functions with
discontinuous limit, the convergence is not uniform on $[0, 1]$. Nevertheless,
term-by-term integration on $[0, 1]$ leads to a correct result in this case. In fact, we have

$$ \int_0^1 f_n(x)\,dx = \int_0^1 x^n\,dx = \frac{1}{n + 1} \to 0 \quad \text{as } n \to \infty, $$

so $\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 f(x)\,dx = 0$.

The sequence in the foregoing example, although not uniformly convergent
on $[0, 1]$, is uniformly convergent on every closed subinterval of $[0, 1]$ not
containing 1. The next theorem is a general result which permits term-by-term
integration in examples of this type. The added ingredient is that we assume that
$\{f_n\}$ is uniformly bounded on $[a, b]$ and that the limit function $f$ is integrable.

Definition 9.10. A sequence of functions $\{f_n\}$ is said to be boundedly convergent on
$T$ if $\{f_n\}$ is pointwise convergent and uniformly bounded on $T$.

Theorem 9.11. Let $\{f_n\}$ be a boundedly convergent sequence on $[a, b]$. Assume that
each $f_n \in R$ on $[a, b]$, and that the limit function $f \in R$ on $[a, b]$. Assume also that
there is a partition $P$ of $[a, b]$, say

$$ P = \{x_0, x_1, \ldots, x_m\}, $$

such that, on every subinterval $[c, d]$ not containing any of the points $x_k$, the sequence
$\{f_n\}$ converges uniformly to $f$. Then we have

$$ \lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b \lim_{n \to \infty} f_n(t)\,dt = \int_a^b f(t)\,dt. $$



Proof. Since $f$ is bounded and $\{f_n\}$ is uniformly bounded, there is a positive
number $M$ such that $|f(x)| \le M$ and $|f_n(x)| \le M$ for all $x$ in $[a, b]$ and all
$n \ge 1$. Given $\varepsilon > 0$ such that $2\varepsilon < \|P\|$, let $h = \varepsilon/(2m)$,
where $m$ is the number of subintervals of $P$, and consider a new partition $P'$ of
$[a, b]$ given by

$$ P' = \{x_0, x_0 + h, x_1 - h, x_1 + h, \ldots, x_{m-1} - h, x_{m-1} + h, x_m - h, x_m\}. $$

Since $|f - f_n|$ is integrable on $[a, b]$ and bounded by $2M$, the sum of the integrals
of $|f - f_n|$ taken over the intervals

$$ [x_0, x_0 + h],\ [x_1 - h, x_1 + h],\ \ldots,\ [x_{m-1} - h, x_{m-1} + h],\ [x_m - h, x_m], $$

is at most $2M(2mh) = 2M\varepsilon$. The remaining portion of $[a, b]$ (call it $S$) is the
union of a finite number of closed intervals, in each of which $\{f_n\}$ is uniformly
convergent to $f$. Therefore, there is an integer $N$ (depending only on $\varepsilon$) such
that for all $x$ in $S$ we have

$$ |f(x) - f_n(x)| < \varepsilon \quad \text{whenever } n \ge N. $$

Hence the sum of the integrals of $|f - f_n|$ over the intervals of $S$ is at most
$\varepsilon(b - a)$, so

$$ \int_a^b |f(x) - f_n(x)|\,dx \le (2M + b - a)\varepsilon \quad \text{whenever } n \ge N. $$

This proves that $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$ as $n \to \infty$.

There is a stronger theorem due to Arzela which makes no reference whatever 
to uniform convergence. 


Theorem 9.12 (Arzelà). Assume that $\{f_n\}$ is boundedly convergent on $[a, b]$ and
suppose each $f_n$ is Riemann-integrable on $[a, b]$. Assume also that the limit function
$f$ is Riemann-integrable on $[a, b]$. Then

$$ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n \to \infty} f_n(x)\,dx = \int_a^b f(x)\,dx. $$



The proof of Arzela’s theorem is considerably more difficult than that of 
Theorem 9.11 and will not be given here. In the next chapter we shall prove a 
theorem on Lebesgue integrals which includes Arzela’s theorem as a special case. 
(See Theorem 10.29). 


NOTE. It is easy to give an example of a boundedly convergent sequence $\{f_n\}$
of Riemann-integrable functions whose limit $f$ is not Riemann-integrable. If
$\{r_1, r_2, \ldots\}$ denotes the set of rational numbers in $[0, 1]$, define $f_n(x)$ to have
the value 1 if $x = r_k$ for some $k = 1, 2, \ldots, n$, and put $f_n(x) = 0$ otherwise. Then
the integral $\int_0^1 f_n(x)\,dx = 0$ for each $n$, but the pointwise limit function $f$ is not
Riemann-integrable on $[0, 1]$.


9.10 UNIFORM CONVERGENCE AND DIFFERENTIATION 

By analogy with Theorems 9.2 and 9.8, one might expect the following result to
hold: If $f_n \to f$ uniformly on $[a, b]$ and if $f_n'$ exists for each $n$, then $f'$ exists and
$f_n' \to f'$ uniformly on $[a, b]$. However, Example 3 of Section 9.2 shows that this
cannot be true. Although the sequence $\{f_n\}$ of Example 3 converges uniformly on
$\mathbf{R}$, the sequence $\{f_n'\}$ does not even converge pointwise on $\mathbf{R}$. For example,
$\{f_n'(0)\}$ diverges since $f_n'(0) = \sqrt{n}$. Therefore the analog of Theorems 9.2 and
9.8 for differentiation must take a different form.
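The contrast between $\{f_n\}$ and $\{f_n'\}$ in Example 3 is easy to tabulate. The sketch
below simply evaluates the two formulas; the names `f` and `df` are illustrative, and
the bound $|f_n(x)| \le 1/\sqrt{n}$ used in the comment follows from $|\sin| \le 1$.

```python
import math


def f(n, x):
    """n-th term of Example 3: sin(nx) / sqrt(n)."""
    return math.sin(n * x) / math.sqrt(n)


def df(n, x):
    """Derivative of the n-th term: sqrt(n) cos(nx)."""
    return math.sqrt(n) * math.cos(n * x)


if __name__ == "__main__":
    for n in (1, 100, 10000):
        # |f_n| is at most 1/sqrt(n) everywhere, while f_n'(0) = sqrt(n) grows.
        print(n, 1.0 / math.sqrt(n), df(n, 0.0))
```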
Theorem 9.13. Assume that each term of $\{f_n\}$ is a real-valued function having a
finite derivative at each point of an open interval $(a, b)$. Assume that for at least one
point $x_0$ in $(a, b)$ the sequence $\{f_n(x_0)\}$ converges. Assume further that there exists
a function $g$ such that $f_n' \to g$ uniformly on $(a, b)$. Then:

a) There exists a function $f$ such that $f_n \to f$ uniformly on $(a, b)$.

b) For each $x$ in $(a, b)$ the derivative $f'(x)$ exists and equals $g(x)$.

Proof. Assume that $c \in (a, b)$ and define a new sequence $\{g_n\}$ as follows:

$$ g_n(x) = \begin{cases}
\dfrac{f_n(x) - f_n(c)}{x - c} & \text{if } x \ne c, \\
f_n'(c) & \text{if } x = c.
\end{cases} \tag{8} $$

The sequence $\{g_n\}$ so formed depends on the choice of $c$. Convergence of $\{g_n(c)\}$
follows from the hypothesis, since $g_n(c) = f_n'(c)$. We will prove next that $\{g_n\}$
converges uniformly on $(a, b)$. If $x \ne c$, we have

$$ g_n(x) - g_m(x) = \frac{h(x) - h(c)}{x - c}, \tag{9} $$

where $h(x) = f_n(x) - f_m(x)$. Now $h'(x)$ exists for each $x$ in $(a, b)$ and has the value
$f_n'(x) - f_m'(x)$. Applying the Mean-Value Theorem in (9), we get

$$ g_n(x) - g_m(x) = f_n'(x_1) - f_m'(x_1), \tag{10} $$

where $x_1$ lies between $x$ and $c$. Since $\{f_n'\}$ converges uniformly on $(a, b)$ (by
hypothesis), we can use (10), together with the Cauchy condition (Theorem 9.3),
to deduce that $\{g_n\}$ converges uniformly on $(a, b)$.

Now we can show that $\{f_n\}$ converges uniformly on $(a, b)$. Let us form the
particular sequence $\{g_n\}$ corresponding to the special point $c = x_0$ for which
$\{f_n(x_0)\}$ is assumed to converge. From (8) we can write

$$ f_n(x) = f_n(x_0) + (x - x_0) g_n(x), $$

an equation which holds for every $x$ in $(a, b)$. Hence we have

$$ f_n(x) - f_m(x) = f_n(x_0) - f_m(x_0) + (x - x_0)[g_n(x) - g_m(x)]. $$

This equation, with the help of the Cauchy condition, establishes the uniform
convergence of $\{f_n\}$ on $(a, b)$. This proves (a).

To prove (b), return to the sequence $\{g_n\}$ defined by (8) for an arbitrary point
$c$ in $(a, b)$ and let $G(x) = \lim_{n \to \infty} g_n(x)$. The hypothesis that $f_n'$ exists means
that $\lim_{x \to c} g_n(x) = g_n(c)$. In other words, each $g_n$ is continuous at $c$. Since
$g_n \to G$ uniformly on $(a, b)$, the limit function $G$ is also continuous at $c$. This means
that

$$ G(c) = \lim_{x \to c} G(x), \tag{11} $$

the existence of the limit being part of the conclusion. But, for $x \ne c$, we have

$$ G(x) = \lim_{n \to \infty} g_n(x) = \lim_{n \to \infty} \frac{f_n(x) - f_n(c)}{x - c} = \frac{f(x) - f(c)}{x - c}. $$

Hence, (11) states that the derivative $f'(c)$ exists and equals $G(c)$. But

$$ G(c) = \lim_{n \to \infty} g_n(c) = \lim_{n \to \infty} f_n'(c) = g(c); $$

hence $f'(c) = g(c)$. Since $c$ is an arbitrary point of $(a, b)$, this proves (b).

When we reformulate Theorem 9.13 in terms of series, we obtain

Theorem 9.14. Assume that each $f_n$ is a real-valued function defined on $(a, b)$ such
that the derivative $f_n'(x)$ exists for each $x$ in $(a, b)$. Assume that, for at least one
point $x_0$ in $(a, b)$, the series $\sum f_n(x_0)$ converges. Assume further that there exists a
function $g$ such that $\sum f_n'(x) = g(x)$ (uniformly on $(a, b)$). Then:

a) There exists a function $f$ such that $\sum f_n(x) = f(x)$ (uniformly on $(a, b)$).

b) If $x \in (a, b)$, the derivative $f'(x)$ exists and equals $\sum f_n'(x)$.


9.11 SUFFICIENT CONDITIONS FOR UNIFORM CONVERGENCE OF 
A SERIES 

The importance of uniformly convergent series has been amply illustrated in some
of the preceding theorems. Therefore it seems natural to seek some simple ways of
testing a series for uniform convergence without resorting to the definition in each
case. One such test, the Weierstrass M-test, was described in Theorem 9.6. There
are other tests that may be useful when the M-test is not applicable. One of these
is the analog of Theorem 8.28.

Theorem 9.15 (Dirichlet’s test for uniform convergence). Let $F_n(x)$ denote the $n$th
partial sum of the series $\sum f_n(x)$, where each $f_n$ is a complex-valued function defined
on a set $S$. Assume that $\{F_n\}$ is uniformly bounded on $S$. Let $\{g_n\}$ be a sequence of
real-valued functions such that $g_{n+1}(x) \le g_n(x)$ for each $x$ in $S$ and for every
$n = 1, 2, \ldots$, and assume that $g_n \to 0$ uniformly on $S$. Then the series
$\sum f_n(x) g_n(x)$ converges uniformly on $S$.

Proof. Let $s_n(x) = \sum_{k=1}^{n} f_k(x) g_k(x)$. By partial summation we have

$$ s_n(x) = \sum_{k=1}^{n} F_k(x)(g_k(x) - g_{k+1}(x)) + g_{n+1}(x) F_n(x), $$

and hence if $n > m$, we can write

$$ s_n(x) - s_m(x) = \sum_{k=m+1}^{n} F_k(x)(g_k(x) - g_{k+1}(x)) + g_{n+1}(x) F_n(x) - g_{m+1}(x) F_m(x). $$

Therefore, if $M$ is a uniform bound for $\{F_n\}$, we have

$$ \begin{aligned}
|s_n(x) - s_m(x)| &\le M \sum_{k=m+1}^{n} (g_k(x) - g_{k+1}(x)) + M g_{n+1}(x) + M g_{m+1}(x) \\
&= M(g_{m+1}(x) - g_{n+1}(x)) + M g_{n+1}(x) + M g_{m+1}(x) \\
&= 2M g_{m+1}(x).
\end{aligned} $$

Since $g_n \to 0$ uniformly on $S$, this inequality (together with the Cauchy condition)
implies that $\sum f_n(x) g_n(x)$ converges uniformly on $S$.

The reader should have no difficulty in extending Theorem 8.29 (Abel’s test)
in a similar way so that it yields a test for uniform convergence. (Exercise 9.13.)

Example. Let $F_n(x) = \sum_{k=1}^{n} e^{ikx}$. In the last chapter (see Theorem 8.30), we
derived the inequality $|F_n(x)| \le 1/|\sin(x/2)|$, valid for every real $x \ne 2m\pi$ ($m$ is
an integer). Therefore, if $0 < \delta < \pi$, we have the estimate

$$ |F_n(x)| \le 1/\sin(\delta/2) \quad \text{if } \delta \le x \le 2\pi - \delta. $$

Hence, $\{F_n\}$ is uniformly bounded on the interval $[\delta, 2\pi - \delta]$. If $\{g_n\}$
satisfies the conditions of Theorem 9.15, we can conclude that the series
$\sum g_n(x) e^{inx}$ converges uniformly on $[\delta, 2\pi - \delta]$. In particular, if we take
$g_n(x) = 1/n$, this establishes the uniform convergence of the series

$$ \sum_{n=1}^{\infty} \frac{e^{inx}}{n} $$

on $[\delta, 2\pi - \delta]$ if $0 < \delta < \pi$. Note that the Weierstrass M-test cannot be used
to establish uniform convergence in this case, since $|e^{inx}| = 1$.
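The uniform bound on the partial sums $F_n$ can be spot-checked numerically. In the
sketch below, `F` and `worst_ratio` are hypothetical helper names, and sampling
finitely many $n$ and $x$ is an assumption of the illustration; it supports the
estimate but is no substitute for Theorem 8.30.

```python
import cmath
import math


def F(n, x):
    """Partial sum e^{ix} + e^{2ix} + ... + e^{inx}."""
    return sum(cmath.exp(1j * k * x) for k in range(1, n + 1))


def worst_ratio(delta, n_max=30, points=200):
    """Largest sampled |F_n(x)| * sin(delta/2) over [delta, 2*pi - delta]."""
    bound = 1.0 / math.sin(delta / 2.0)
    worst = 0.0
    for i in range(points + 1):
        x = delta + (2.0 * math.pi - 2.0 * delta) * i / points
        for n in range(1, n_max + 1):
            worst = max(worst, abs(F(n, x)) / bound)
    return worst


if __name__ == "__main__":
    print(worst_ratio(0.5))
```

A ratio that never exceeds 1 (up to rounding) is consistent with the estimate
$|F_n(x)| \le 1/\sin(\delta/2)$ on $[\delta, 2\pi - \delta]$.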


9.12 UNIFORM CONVERGENCE AND DOUBLE SEQUENCES 

As a different type of application of uniform convergence, we deduce the following
theorem on double sequences which can be viewed as a converse to Theorem 8.39.

Theorem 9.16. Let $f$ be a double sequence and let $\mathbf{Z}^+$ denote the set of positive
integers. For each $n = 1, 2, \ldots$, define a function $g_n$ on $\mathbf{Z}^+$ as follows:

$$ g_n(m) = f(m, n), \quad \text{if } m \in \mathbf{Z}^+. $$

Assume that $g_n \to g$ uniformly on $\mathbf{Z}^+$, where $g(m) = \lim_{n \to \infty} f(m, n)$. If the
iterated limit $\lim_{m \to \infty} (\lim_{n \to \infty} f(m, n))$ exists, then the double limit
$\lim_{m,n \to \infty} f(m, n)$ also exists and has the same value.

Proof. Given $\varepsilon > 0$, choose $N_1$ so that $n > N_1$ implies

$$ |f(m, n) - g(m)| < \frac{\varepsilon}{2}, \quad \text{for every } m \text{ in } \mathbf{Z}^+. $$

Let $a = \lim_{m \to \infty} (\lim_{n \to \infty} f(m, n)) = \lim_{m \to \infty} g(m)$. For the same
$\varepsilon$, choose $N_2$ so that $m > N_2$ implies $|g(m) - a| < \varepsilon/2$. Then, if $N$ is
the larger of $N_1$ and $N_2$, we have $|f(m, n) - a| < \varepsilon$ whenever both $m > N$ and
$n > N$. In other words, $\lim_{m,n \to \infty} f(m, n) = a$.


9.13 MEAN CONVERGENCE 

The functions in this section may be real- or complex-valued. 

Definition 9.17. Let $\{f_n\}$ be a sequence of Riemann-integrable functions defined on
$[a, b]$. Assume that $f \in R$ on $[a, b]$. The sequence $\{f_n\}$ is said to converge in the
mean to $f$ on $[a, b]$, and we write

$$ \operatorname*{l.i.m.}_{n \to \infty} f_n = f \quad \text{on } [a, b], $$

if

$$ \lim_{n \to \infty} \int_a^b |f_n(x) - f(x)|^2\,dx = 0. $$

If the inequality $|f(x) - f_n(x)| < \varepsilon$ holds for every $x$ in $[a, b]$, then we have
$\int_a^b |f(x) - f_n(x)|^2\,dx \le \varepsilon^2 (b - a)$. Therefore, uniform convergence of
$\{f_n\}$ to $f$ on $[a, b]$ implies mean convergence, provided that each $f_n$ is
Riemann-integrable on $[a, b]$. A rather surprising fact is that convergence in the mean
need not imply pointwise convergence at any point of the interval. This can be seen as
follows: For each integer $n \ge 0$, subdivide the interval $[0, 1]$ into $2^n$ equal
subintervals and let $I_{2^n + k}$ denote that subinterval whose right endpoint is
$(k + 1)/2^n$, where $k = 0, 1, 2, \ldots, 2^n - 1$. This yields a collection
$\{I_1, I_2, \ldots\}$ of subintervals of $[0, 1]$, of which the first few are:

$$ I_1 = [0, 1], \quad I_2 = [0, \tfrac{1}{2}], \quad I_3 = [\tfrac{1}{2}, 1], $$

$$ I_4 = [0, \tfrac{1}{4}], \quad I_5 = [\tfrac{1}{4}, \tfrac{1}{2}], \quad I_6 = [\tfrac{1}{2}, \tfrac{3}{4}], $$

and so forth. Define $f_n$ on $[0, 1]$ as follows:

$$ f_n(x) = \begin{cases} 1 & \text{if } x \in I_n, \\ 0 & \text{if } x \in [0, 1] - I_n. \end{cases} $$

Then $\{f_n\}$ converges in the mean to 0, since $\int_0^1 |f_n(x)|^2\,dx$ is the length of
$I_n$, and this approaches 0 as $n \to \infty$. On the other hand, for each $x$ in $[0, 1]$ we
have

$$ \limsup_{n \to \infty} f_n(x) = 1 \quad \text{and} \quad \liminf_{n \to \infty} f_n(x) = 0. $$

[Why?] Hence, $\{f_n(x)\}$ does not converge for any $x$ in $[0, 1]$.
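The intervals $I_n$ and the functions $f_n$ above are simple to generate, which gives
one way to explore the "[Why?]" numerically. The sketch below is an illustration
only; the helper names `interval`, `f`, and `mean_square` are hypothetical, and the
indexing $n = 2^j + k$ is read off from the binary length of $n$.

```python
from fractions import Fraction


def interval(n):
    """Endpoints of I_n, where n = 2^j + k with 0 <= k < 2^j."""
    j = n.bit_length() - 1
    k = n - 2**j
    return Fraction(k, 2**j), Fraction(k + 1, 2**j)


def f(n, x):
    """Indicator function of I_n evaluated at x."""
    left, right = interval(n)
    return 1 if left <= x <= right else 0


def mean_square(n):
    """Integral of |f_n|^2 over [0, 1], i.e. the length of I_n."""
    left, right = interval(n)
    return right - left


if __name__ == "__main__":
    x = Fraction(3, 10)
    # In every block 2^j <= n < 2^(j+1) (j >= 1) the values 0 and 1 both occur.
    print([f(n, x) for n in range(8, 16)], mean_square(8))
```

Within each dyadic block of indices the point $x$ lies in at least one interval
and misses at least one, so the values 0 and 1 both recur for arbitrarily large $n$,
while the interval lengths tend to 0.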

The next theorem illustrates the importance of mean convergence. 





Theorem 9.18. Assume that $\operatorname*{l.i.m.}_{n \to \infty} f_n = f$ on $[a, b]$. If
$g \in R$ on $[a, b]$, define

$$ h(x) = \int_a^x f(t) g(t)\,dt, \qquad h_n(x) = \int_a^x f_n(t) g(t)\,dt, $$

if $x \in [a, b]$. Then $h_n \to h$ uniformly on $[a, b]$.

Proof. The proof is based on the inequality

$$ 0 \le \left( \int_a^x |f(t) - f_n(t)|\,|g(t)|\,dt \right)^2
   \le \left( \int_a^x |f(t) - f_n(t)|^2\,dt \right) \left( \int_a^x |g(t)|^2\,dt \right), \tag{12} $$

which is a direct application of the Cauchy-Schwarz inequality for integrals. (See
Exercise 7.16 for the statement of the Cauchy-Schwarz inequality and a sketch of
its proof.) Given $\varepsilon > 0$, we can choose $N$ so that $n > N$ implies

$$ \int_a^b |f(t) - f_n(t)|^2\,dt < \frac{\varepsilon^2}{A}, \tag{13} $$

where $A = 1 + \int_a^b |g(t)|^2\,dt$. Substituting (13) in (12), we find that $n > N$ implies
$0 \le |h(x) - h_n(x)| < \varepsilon$ for every $x$ in $[a, b]$.

This theorem is particularly useful in the theory of Fourier series. (See Theorem
11.16.) The following generalization is also of interest.

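Theorem 9.19. Assume that $\operatorname*{l.i.m.}_{n \to \infty} f_n = f$ and
$\operatorname*{l.i.m.}_{n \to \infty} g_n = g$ on $[a, b]$. Define

$$ h(x) = \int_a^x f(t) g(t)\,dt, \qquad h_n(x) = \int_a^x f_n(t) g_n(t)\,dt, $$

if $x \in [a, b]$. Then $h_n \to h$ uniformly on $[a, b]$.

Proof. We have

$$ h_n(x) - h(x) = \int_a^x (f - f_n)(g - g_n)\,dt
   + \left( \int_a^x f_n g\,dt - \int_a^x f g\,dt \right)
   + \left( \int_a^x f g_n\,dt - \int_a^x f g\,dt \right). $$

Applying the Cauchy-Schwarz inequality, we can write

$$ 0 \le \left( \int_a^x |f - f_n|\,|g - g_n|\,dt \right)^2
   \le \left( \int_a^b |f - f_n|^2\,dt \right) \left( \int_a^b |g - g_n|^2\,dt \right). $$

The proof is now an easy consequence of Theorem 9.18.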





9.14 POWER SERIES 
An infinite series of the form

$$ a_0 + \sum_{n=1}^{\infty} a_n (z - z_0)^n, $$

written more briefly as

$$ \sum_{n=0}^{\infty} a_n (z - z_0)^n, \tag{14} $$

is called a power series in $z - z_0$. Here $z$, $z_0$, and $a_n$ ($n = 0, 1, 2, \ldots$) are
complex numbers. With every power series (14) there is associated a disk, called the disk
of convergence, such that the series converges absolutely for every $z$ interior to
this disk and diverges for every $z$ outside this disk. The center of the disk is at $z_0$
and its radius is called the radius of convergence of the power series. (The radius
may be 0 or $+\infty$ in extreme cases.) The next theorem establishes the existence of
the disk of convergence and provides us with a way of calculating its radius.

Theorem 9.20. Given a power series $\sum_{n=0}^{\infty} a_n (z - z_0)^n$, let

$$ \lambda = \limsup_{n \to \infty} \sqrt[n]{|a_n|}, \qquad r = \frac{1}{\lambda}, $$

(where $r = 0$ if $\lambda = +\infty$ and $r = +\infty$ if $\lambda = 0$). Then the series
converges absolutely if $|z - z_0| < r$ and diverges if $|z - z_0| > r$. Furthermore, the
series converges uniformly on every compact subset interior to the disk of convergence.

Proof. Applying the root test (Theorem 8.26), we have

$$ \limsup_{n \to \infty} \sqrt[n]{|a_n (z - z_0)^n|} = \frac{|z - z_0|}{r}, $$

and hence $\sum a_n (z - z_0)^n$ converges absolutely if $|z - z_0| < r$ and diverges if
$|z - z_0| > r$.

To prove the second assertion, we simply observe that if $T$ is a compact subset
of the disk of convergence, there is a point $p$ in $T$ such that $z \in T$ implies

$$ |z - z_0| \le |p - z_0| < r. $$

Hence, $|a_n (z - z_0)^n| \le |a_n (p - z_0)^n|$ for each $z$ in $T$, and the Weierstrass
M-test is applicable.

NOTE. If the limit $\lim_{n \to \infty} |a_n / a_{n+1}|$ exists (or if this limit is $+\infty$), its
value is also equal to the radius of convergence of (14). (See Exercise 9.30.)
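The formula $r = 1/\limsup \sqrt[n]{|a_n|}$ can be explored numerically, with the
caveat that a lim sup cannot be computed from finitely many terms. The sketch below
replaces it by a maximum over a finite index window, which is purely a heuristic
assumption; `radius_estimate` and its parameters are hypothetical names, and the
coefficients are supplied through $\log |a_n|$ to avoid overflow.

```python
import math


def radius_estimate(log_abs_coeff, start=200, stop=400):
    """Heuristic for 1 / limsup |a_n|^(1/n) using indices start <= n < stop."""
    lam = max(math.exp(log_abs_coeff(n) / n) for n in range(start, stop))
    return 1.0 / lam


if __name__ == "__main__":
    # a_n = 2^n has radius 1/2; a_n = 1/n^2 has radius 1.
    print(radius_estimate(lambda n: n * math.log(2.0)))
    print(radius_estimate(lambda n: -2.0 * math.log(n)))
```

For $a_n = 1/n^2$ the estimate approaches 1 only slowly, because
$\sqrt[n]{n^2} \to 1$ at a logarithmic rate.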

Example 1. The two series $\sum_{n=0}^{\infty} z^n$ and $\sum_{n=1}^{\infty} z^n/n^2$ have the same
radius of convergence, namely, $r = 1$. On the boundary of the disk of convergence, the
first converges nowhere, the second converges everywhere.

Example 2. The series $\sum_{n=1}^{\infty} z^n/n$ has radius of convergence $r = 1$, but it does
not converge at $z = 1$. However, it does converge everywhere else on the boundary because
of Dirichlet’s test (Theorem 8.28).

These examples illustrate why Theorem 9.20 makes no assertion about the behavior
of a power series on the boundary of the disk of convergence.

Theorem 9.21. Assume that the power series $\sum_{n=0}^{\infty} a_n (z - z_0)^n$ converges for
each $z$ in $B(z_0; r)$. Then the function $f$ defined by the equation

$$ f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \quad \text{if } z \in B(z_0; r), \tag{15} $$

is continuous on $B(z_0; r)$.

Proof. Since each point in $B(z_0; r)$ belongs to some compact subset of $B(z_0; r)$,
the conclusion follows at once from Theorem 9.7.

NOTE. The series in (15) is said to represent $f$ in $B(z_0; r)$. It is also called a power
series expansion of $f$ about $z_0$. Functions having power series expansions are
continuous inside the disk of convergence. Much more than this is true, however.
We will later prove that such functions have derivatives of every order inside the
disk of convergence. The proof will make use of the following theorem:

Theorem 9.22. Assume that $\sum a_n (z - z_0)^n$ converges if $z \in B(z_0; r)$. Suppose that
the equation

$$ f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n $$

is known to be valid for each $z$ in some open subset $S$ of $B(z_0; r)$. Then, for each
point $z_1$ in $S$, there exists a neighborhood $B(z_1; R) \subseteq S$ in which $f$ has a power
series expansion of the form

$$ f(z) = \sum_{k=0}^{\infty} b_k (z - z_1)^k, \tag{16} $$

where

$$ b_k = \sum_{n=k}^{\infty} \binom{n}{k} a_n (z_1 - z_0)^{n-k} \qquad (k = 0, 1, 2, \ldots). \tag{17} $$

Proof If z e S, we have 

oo oo 

/(z) = a n( Z - Z 0 )" = Z) a n( 2 ~ Z 1 + 2 1 - Z of 
n=0 n = 0 

-t'a.thi 2 -^- 2 of- k 

0 k= 0 \kj 

00 00 

= EE C n(k), 

n - 0 k = 0 
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where 

, (t) J(i) ^ 

(0, if k > n. 

Now choose R so that B{z i ; R) c s and assume that z e B(z 1 ; R). Then the 
iterated series Yf -/. Yf Ln c n (k) converges absolutely, since 


OO oo 


E E MV I = 

H-n ir— n 


00 00 

E l a »l(l Z ~ Z ll + I Z 1 ~ Z ol)" = E l a »l ( Z 2 - Z o)", 


n = 0 


n — 0 



where 

But 


z 2 = z 0 + \z - Zj I + |Zj - z 0 |. 
|z 2 - z 0 | < + |zj - z 0 | < r, 


and hence the series in (18) converges. Therefore, by Theorem 8.43, we can inter- 
change the order of summation to obtain 


00 


00 


00 


00 


/CO = Z E C n( k ) = X X ( ” ) - Z l)*( Z l - Z o)" 


-fc 


* = 0 n = 0 


fc = 0 n = 


= E - z i)*> 

k = 0 

where b k is given by (17). This completes the proof. 

note. In the course of the proof we have shown that we may use any $R > 0$ that
satisfies the condition

\[ B(z_1; R) \subseteq S. \tag{19} \]
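
As a numerical illustration of formula (17), the following sketch re-expands the geometric series $\sum z^n = 1/(1 - z)$ (so $a_n = 1$, $z_0 = 0$) about the point $z_1 = 0.3$. The function name `reexpand` and the truncation at 200 coefficients are assumptions made for this example only; for this particular $f$ the exact values are $b_k = (1 - z_1)^{-(k+1)}$.

```python
from math import comb

def reexpand(a, z0, z1, num_coeffs):
    """Approximate the b_k of (17) from the finitely many coefficients in `a`.

    b_k = sum_{n >= k} C(n, k) * a_n * (z1 - z0)^(n - k), truncated at len(a).
    """
    shift = z1 - z0
    return [
        sum(comb(n, k) * a[n] * shift ** (n - k) for n in range(k, len(a)))
        for k in range(num_coeffs)
    ]

# Geometric series: a_n = 1 and z0 = 0, so f(z) = 1/(1 - z) on B(0; 1).
a = [1.0] * 200
b = reexpand(a, 0.0, 0.3, 5)

# For this f, the exact coefficients about z1 are 1/(1 - z1)^(k+1).
exact = [1.0 / (1.0 - 0.3) ** (k + 1) for k in range(5)]
```

The truncated sums agree with the exact coefficients to many digits because $|z_1 - z_0| = 0.3$ is well inside the disk of convergence.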


Theorem 9.23. Assume that $\sum a_n (z - z_0)^n$ converges for each $z$ in $B(z_0; r)$. Then
the function $f$ defined by the equation

\[ f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad \text{if } z \in B(z_0; r), \tag{20} \]

has a derivative $f'(z)$ for each $z$ in $B(z_0; r)$, given by

\[ f'(z) = \sum_{n=1}^{\infty} n a_n (z - z_0)^{n-1}. \tag{21} \]

note. The series in (20) and (21) have the same radius of convergence. 

Proof. Assume that $z_1 \in B(z_0; r)$ and expand $f$ in a power series about $z_1$, as
indicated in (16). Then, if $z \in B(z_1; R)$, $z \ne z_1$, we have

\[ \frac{f(z) - f(z_1)}{z - z_1} = b_1 + \sum_{k=1}^{\infty} b_{k+1} (z - z_1)^k. \tag{22} \]

By continuity, the right member of (22) tends to $b_1$ as $z \to z_1$. Hence, $f'(z_1)$ exists
and equals $b_1$. Using (17) to compute $b_1$, we find

\[ b_1 = \sum_{n=1}^{\infty} n a_n (z_1 - z_0)^{n-1}. \]

Since $z_1$ is an arbitrary point of $B(z_0; r)$, this proves (21). The two series have the
same radius of convergence because $\sqrt[n]{n} \to 1$ as $n \to \infty$.

note. By repeated application of (21), we find that for each $k = 1, 2, \dots$, the
derivative $f^{(k)}(z)$ exists in $B(z_0; r)$ and is given by the series

\[ f^{(k)}(z) = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!} a_n (z - z_0)^{n-k}. \tag{23} \]

If we put $z = z_0$ in (23), we obtain the important formula

\[ f^{(k)}(z_0) = k!\, a_k \qquad (k = 1, 2, \dots). \tag{24} \]

This equation tells us that if two power series $\sum a_n (z - z_0)^n$ and $\sum b_n (z - z_0)^n$ both
represent the same function in a neighborhood $B(z_0; r)$, then $a_n = b_n$ for every $n$.
That is, the power series expansion of a function $f$ about a given point $z_0$ is uniquely
determined (if it exists at all), and it is given by the formula

\[ f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(z_0)}{n!} (z - z_0)^n, \]

valid for each $z$ in the disk of convergence.
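
Term-by-term differentiation (21) and formula (24) are easy to check numerically on a truncated series. The sketch below is an illustration only; the helper names `differentiate` and `evaluate` and the choice of the geometric series are assumptions for this example.

```python
def differentiate(a):
    """Coefficients of the termwise derivative: n*a_n multiplies (z - z0)^(n-1)."""
    return [n * a[n] for n in range(1, len(a))]

def evaluate(a, w):
    """Partial sum of sum a_n * w^n, where w = z - z0, using Horner's rule."""
    total = 0.0
    for coeff in reversed(a):
        total = total * w + coeff
    return total

a = [1.0] * 200            # truncated series for 1/(1 - z) about z0 = 0
d = differentiate(a)       # should represent 1/(1 - z)^2
value = evaluate(d, 0.5)   # compare with 1/(1 - 0.5)^2 = 4

# Formula (24): the constant term of the k-th derived series is k! * a_k.
third = differentiate(differentiate(differentiate(a)))
```

Here `value` is close to 4 and `third[0]` equals $3!\,a_3 = 6$, as (24) predicts.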


9.15 MULTIPLICATION OF POWER SERIES 

Theorem 9.24. Given two power series expansions about the origin, say

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad \text{if } z \in B(0; r), \]

and

\[ g(z) = \sum_{n=0}^{\infty} b_n z^n, \qquad \text{if } z \in B(0; R). \]

Then the product $f(z)g(z)$ is given by the power series

\[ f(z)g(z) = \sum_{n=0}^{\infty} c_n z^n, \qquad \text{if } z \in B(0; r) \cap B(0; R), \]

where

\[ c_n = \sum_{k=0}^{n} a_k b_{n-k} \qquad (n = 0, 1, 2, \dots). \]

Proof. The Cauchy product of the two given series is

\[ \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k z^k \cdot b_{n-k} z^{n-k} \right) = \sum_{n=0}^{\infty} c_n z^n, \]

and the conclusion follows from Theorem 8.46 (Mertens' Theorem).

note. If the two series are identical, we get

\[ f(z)^2 = \sum_{n=0}^{\infty} c_n z^n, \]

where $c_n = \sum_{k=0}^{n} a_k a_{n-k} = \sum_{m_1 + m_2 = n} a_{m_1} a_{m_2}$. The symbol $\sum_{m_1 + m_2 = n}$ indicates
that the summation is to be extended over all nonnegative integers $m_1$ and $m_2$
whose sum is $n$. Similarly, for any integer $p > 0$, we have

\[ f(z)^p = \sum_{n=0}^{\infty} c_n(p) z^n, \]

where

\[ c_n(p) = \sum_{m_1 + \cdots + m_p = n} a_{m_1} \cdots a_{m_p}. \]
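
The coefficient formula $c_n = \sum_{k=0}^{n} a_k b_{n-k}$ of Theorem 9.24 can be sketched directly on truncated coefficient lists. The function name `cauchy_product` and the use of exact rational arithmetic are choices made for this illustration.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k}, for n below min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

# (1/(1 - z))^2 has coefficients n + 1.
squares = cauchy_product([1] * 6, [1] * 6)

# exp(z) * exp(z) = exp(2z), whose coefficients are 2^n / n!.
exp_coeffs = [Fraction(1, factorial(n)) for n in range(8)]
exp_squared = cauchy_product(exp_coeffs, exp_coeffs)
```

Only the first `min(len(a), len(b))` coefficients are returned, because later ones would need coefficients that the truncated lists do not contain.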


9.16 THE SUBSTITUTION THEOREM

Theorem 9.25. Given two power series expansions about the origin, say

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad \text{if } z \in B(0; r), \]

and

\[ g(z) = \sum_{n=0}^{\infty} b_n z^n, \qquad \text{if } z \in B(0; R). \]

If, for a fixed $z$ in $B(0; R)$, we have $\sum_{n=0}^{\infty} |b_n z^n| < r$, then for this $z$ we can write

\[ f[g(z)] = \sum_{k=0}^{\infty} c_k z^k, \]

where the coefficients $c_k$ are obtained as follows: Define the numbers $b_k(n)$ by the
equation

\[ g(z)^n = \left( \sum_{k=0}^{\infty} b_k z^k \right)^n = \sum_{k=0}^{\infty} b_k(n) z^k. \]

Then $c_k = \sum_{n=0}^{\infty} a_n b_k(n)$ for $k = 0, 1, 2, \dots$

note. The series $\sum_{k=0}^{\infty} c_k z^k$ is the power series which arises formally by substituting
the series for $g(z)$ in place of $z$ in the expansion of $f$ and then rearranging terms in
increasing powers of $z$.

Proof. By hypothesis, we can choose $z$ so that $\sum_{n=0}^{\infty} |b_n z^n| < r$. For this $z$ we have
$|g(z)| < r$ and hence we can write

\[ f[g(z)] = \sum_{n=0}^{\infty} a_n g(z)^n = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} a_n b_k(n) z^k. \]

If we are allowed to interchange the order of summation, we obtain

\[ f[g(z)] = \sum_{k=0}^{\infty} \left( \sum_{n=0}^{\infty} a_n b_k(n) \right) z^k = \sum_{k=0}^{\infty} c_k z^k, \]

which is the statement we set out to prove. To justify the interchange, we will
establish the convergence of the series

\[ \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} |a_n b_k(n) z^k| = \sum_{n=0}^{\infty} |a_n| \sum_{k=0}^{\infty} |b_k(n) z^k|. \tag{25} \]

Now each number $b_k(n)$ is a finite sum of the form

\[ b_k(n) = \sum_{m_1 + \cdots + m_n = k} b_{m_1} \cdots b_{m_n}, \]

and hence $|b_k(n)| \le \sum_{m_1 + \cdots + m_n = k} |b_{m_1}| \cdots |b_{m_n}|$. On the other hand, we have

\[ \left( \sum_{k=0}^{\infty} |b_k| z^k \right)^n = \sum_{k=0}^{\infty} B_k(n) z^k, \]

where $B_k(n) = \sum_{m_1 + \cdots + m_n = k} |b_{m_1}| \cdots |b_{m_n}|$. Returning to (25), we have

\[
\sum_{n=0}^{\infty} |a_n| \sum_{k=0}^{\infty} |b_k(n) z^k|
\le \sum_{n=0}^{\infty} |a_n| \sum_{k=0}^{\infty} B_k(n) |z^k|
= \sum_{n=0}^{\infty} |a_n| \left( \sum_{k=0}^{\infty} |b_k z^k| \right)^n,
\]

and this establishes the convergence of (25).
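
The recipe $c_k = \sum_n a_n b_k(n)$ can be sketched on truncated coefficient lists. To keep every $c_k$ a finite sum, this sketch assumes $b_0 = 0$ (so that $b_k(n) = 0$ whenever $n > k$); that restriction is an assumption of the example, not of Theorem 9.25, and the names `truncated_product` and `compose` are hypothetical.

```python
def truncated_product(p, q, size):
    """First `size` coefficients of the Cauchy product of p and q (both of length `size`)."""
    return [sum(p[j] * q[k - j] for j in range(k + 1)) for k in range(size)]

def compose(a, b, size):
    """First `size` coefficients c_k of f(g(z)), assuming b[0] == 0."""
    if b[0] != 0:
        raise ValueError("this sketch assumes b[0] == 0")
    a = (list(a) + [0] * size)[:size]   # pad or cut to exactly `size` entries
    g = (list(b) + [0] * size)[:size]
    power = [1] + [0] * (size - 1)      # coefficients b_k(0) of g(z)^0 = 1
    c = [0] * size
    for n in range(size):
        for k in range(size):
            c[k] += a[n] * power[k]     # add the term a_n * b_k(n)
        power = truncated_product(power, g, size)  # advance to b_k(n + 1)
    return c

# f(w) = 1/(1 - w) and g(z) = z + z^2 give 1/(1 - z - z^2): Fibonacci coefficients.
fib = compose([1] * 8, [0, 1, 1], 8)
```

With $b_0 = 0$, powers $g(z)^n$ with $n \ge$ `size` contribute nothing to the first `size` coefficients, so the truncated result is exact.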


9.17 RECIPROCAL OF A POWER SERIES 

As an application of the substitution theorem, we will show that the reciprocal of 
a power series in z is again a power series in z, provided that the constant term is 
not 0. 


Theorem 9.26. Assume that we have

\[ p(z) = \sum_{n=0}^{\infty} p_n z^n, \qquad \text{if } z \in B(0; h), \]

where $p(0) \ne 0$. Then there exists a neighborhood $B(0; \delta)$ in which the reciprocal of
$p$ has a power series expansion of the form

\[ \frac{1}{p(z)} = \sum_{n=0}^{\infty} q_n z^n. \]

Furthermore, $q_0 = 1/p_0$.

Proof. Without loss in generality we can assume that $p_0 = 1$. [Why?] Then
$p(0) = 1$. Let $P(z) = 1 + \sum_{n=1}^{\infty} |p_n z^n|$ if $z \in B(0; h)$. By continuity, there exists
a neighborhood $B(0; \delta)$ such that $|P(z) - 1| < 1$ if $z \in B(0; \delta)$. The conclusion
follows by applying Theorem 9.25 with

\[ f(z) = \frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n \qquad \text{and} \qquad g(z) = 1 - p(z) = -\sum_{n=1}^{\infty} p_n z^n. \]
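
Once the expansion $1/p(z) = \sum q_n z^n$ is known to exist, its coefficients can be computed without the substitution argument: comparing coefficients in $p(z) \cdot (1/p(z)) = 1$ by Theorem 9.24 gives $p_0 q_0 = 1$ and $\sum_{k=0}^{n} p_k q_{n-k} = 0$ for $n \ge 1$. The sketch below uses this recursion, which is a different route from the one taken in the proof; the function name `reciprocal` is an assumption of the example.

```python
from fractions import Fraction

def reciprocal(p, num_coeffs):
    """First `num_coeffs` coefficients q_n of 1/p(z), assuming p[0] != 0.

    Uses q_0 = 1/p_0 and q_n = -(1/p_0) * sum_{k=1}^{n} p_k * q_{n-k}.
    """
    if p[0] == 0:
        raise ValueError("the constant term must be nonzero")
    q = [Fraction(1) / p[0]]
    for n in range(1, num_coeffs):
        upper = min(n, len(p) - 1)      # p_k = 0 beyond the supplied coefficients
        s = sum((p[k] * q[n - k] for k in range(1, upper + 1)), Fraction(0))
        q.append(-s / p[0])
    return q

geometric = reciprocal([1, -1], 5)      # 1/(1 - z)
alternating = reciprocal([1, 1], 4)     # 1/(1 + z)
```

The first coefficient returned is always $1/p_0$, in agreement with the last statement of Theorem 9.26.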


9.18 REAL POWER SERIES 

If $x$, $x_0$, and $a_n$ are real numbers, the series $\sum a_n (x - x_0)^n$ is called a real power
series. Its disk of convergence intersects the real axis in an interval $(x_0 - r, x_0 + r)$
called the interval of convergence.

Each real power series defines a real-valued sum function whose value at each
$x$ in the interval of convergence is given by

\[ f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n. \]

The series is said to represent $f$ in the interval of convergence, and it is called a
power-series expansion of $f$ about $x_0$.

Two problems concern us here: 

1) Given the series, to find properties of the sum function $f$.

2) Given a function $f$, to find whether or not it can be represented by a power
series.

It turns out that only rather special functions possess power-series expansions. 
Nevertheless, the class of such functions includes a large number of examples that 
arise in practice, so their study is of great importance. 

Question (1) is answered by the theorems we have already proved for complex
power series. A power series converges absolutely for each $x$ in the open interval
$(x_0 - r, x_0 + r)$ of convergence, and it converges uniformly on every compact
subset of this interval. Since each term of the power series is continuous on R, the
sum function $f$ is continuous on every compact subset of the interval of convergence
and hence $f$ is continuous on $(x_0 - r, x_0 + r)$.

Because of uniform convergence, Theorem 9.9 tells us that we can integrate a
power series term by term on every compact subinterval inside the interval of
convergence. Thus, for every $x$ in $(x_0 - r, x_0 + r)$ we have

\[ \int_{x_0}^{x} f(t)\,dt = \sum_{n=0}^{\infty} a_n \int_{x_0}^{x} (t - x_0)^n\,dt = \sum_{n=0}^{\infty} \frac{a_n}{n+1} (x - x_0)^{n+1}. \]

The integrated series has the same radius of convergence.
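
Term-by-term integration can be checked on the geometric series, whose integral from 0 to $x$ is $-\log(1 - x)$. This is a minimal sketch on a truncated series; the helper names `integrate_termwise` and `evaluate` are assumptions of the example.

```python
import math

def integrate_termwise(a):
    """Coefficients of the integral from x0 to x: a_n/(n+1) multiplies (x - x0)^(n+1)."""
    return [0.0] + [a[n] / (n + 1) for n in range(len(a))]

def evaluate(a, w):
    """Partial sum of sum a_n * w^n, where w = x - x0, using Horner's rule."""
    total = 0.0
    for coeff in reversed(a):
        total = total * w + coeff
    return total

a = [1.0] * 200                       # truncated series for 1/(1 - x) about 0
integrated = integrate_termwise(a)    # should represent -log(1 - x)
value = evaluate(integrated, 0.5)
expected = -math.log(1.0 - 0.5)       # equals log 2
```

At $x = 0.5$ the neglected tail is smaller than $0.5^{200}$, so `value` and `expected` agree to machine precision.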

The sum function has derivatives of every order in the interval of convergence
and they can be obtained by differentiating the series term by term. Moreover,
$f^{(n)}(x_0) = n!\, a_n$, so the sum function is represented by the power series

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n. \tag{26} \]

We turn now to question (2). Suppose we are given a real-valued function $f$
defined on some open interval $(x_0 - r, x_0 + r)$, and suppose $f$ has derivatives of
every order in this interval. Then we can certainly form the power series on the
right of (26). Does this series converge for any $x$ besides $x = x_0$? If so, is its sum
equal to $f(x)$? In general, the answer to both questions is "No." (See Exercise
9.33 for a counterexample.) A necessary and sufficient condition for answering
both questions in the affirmative is given in the next section with the help of
Taylor's formula (Theorem 5.19).


9.19 THE TAYLOR’S SERIES GENERATED BY A FUNCTION 

Definition 9.27. Let $f$ be a real-valued function defined on an interval $I$ in R. If $f$ has
derivatives of every order at each point of $I$, we write $f \in C^{\infty}$ on $I$.

If $f \in C^{\infty}$ on some neighborhood of a point $c$, the power series

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n \]

is called the Taylor's series about $c$ generated by $f$. To indicate that $f$ generates
this series, we write

\[ f(x) \sim \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n. \]

The question we are interested in is this: When can we replace the symbol $\sim$ by
the symbol $=$? Taylor's formula states that if $f \in C^{\infty}$ on the closed interval $[a, b]$
and if $c \in [a, b]$, then, for every $x$ in $[a, b]$ and for every $n$, we have

\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n)}(x_1)}{n!} (x - c)^n, \tag{27} \]

where $x_1$ is some point between $x$ and $c$. The point $x_1$ depends on $x$, $c$, and on $n$.
Hence a necessary and sufficient condition for the Taylor's series to converge to
$f(x)$ is that

\[ \lim_{n \to \infty} \frac{f^{(n)}(x_1)}{n!} (x - c)^n = 0. \tag{28} \]

In practice it may be quite difficult to deal with this limit because of the unknown
position of $x_1$. In some cases, however, a suitable upper bound can be obtained
for $f^{(n)}(x_1)$ and the limit can be shown to be zero. Since $A^n/n! \to 0$ as $n \to \infty$ for
all $A$, equation (28) will certainly hold if there is a positive constant $M$ such that

\[ |f^{(n)}(x)| \le M^n, \]

for all $x$ in $[a, b]$. In other words, the Taylor's series of a function $f$ converges if
the $n$th derivative $f^{(n)}$ grows no faster than the $n$th power of some positive number.
This is stated more formally in the next theorem.


Theorem 9.28. Assume that $f \in C^{\infty}$ on $[a, b]$ and let $c \in [a, b]$. Assume that there
is a neighborhood $B(c)$ and a constant $M$ (which might depend on $c$) such that
$|f^{(n)}(x)| \le M^n$ for every $x$ in $B(c) \cap [a, b]$ and every $n = 1, 2, \dots$ Then, for
each $x$ in $B(c) \cap [a, b]$, we have

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n. \]
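
For a concrete instance of the bound $|f^{(n)}(x)| \le M^n$, take $f = \sin$ and $c = 0$, where $M = 1$ works on all of R. Taylor's formula (27) then bounds the error of the partial sum with terms $k = 0, \dots, n-1$ by $|x|^n/n!$. The sketch below checks this numerically; the function names are assumptions of the example.

```python
import math

def sin_taylor_partial(x, n):
    """Sum of the Taylor terms of sin about 0 with k = 0, ..., n - 1."""
    derivs_at_zero = [0.0, 1.0, 0.0, -1.0]   # sin, cos, -sin, -cos evaluated at 0
    return sum(derivs_at_zero[k % 4] * x ** k / math.factorial(k) for k in range(n))

def remainder_bound(x, n, M=1.0):
    """Upper bound M^n |x - c|^n / n! for the remainder term in (27), with c = 0."""
    return (M * abs(x)) ** n / math.factorial(n)

x = 2.0
errors = [abs(math.sin(x) - sin_taylor_partial(x, n)) for n in range(1, 21)]
bounds = [remainder_bound(x, n) for n in range(1, 21)]
```

Each entry of `errors` stays below the matching entry of `bounds`, and the bounds tend to 0 because $A^n/n! \to 0$.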



9.20 BERNSTEIN’S THEOREM 

Another sufficient condition for convergence of the Taylor's series of $f$, formulated
by S. Bernstein, will be proved in this section. To simplify the proof we first obtain
another form of Taylor's formula in which the error term is expressed as an
integral.


Theorem 9.29. Assume $f$ has a continuous derivative of order $n + 1$ in some open
interval $I$ containing $c$, and define $E_n(x)$ for $x$ in $I$ by the equation

\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k + E_n(x). \tag{29} \]

Then $E_n(x)$ is also given by the integral

\[ E_n(x) = \frac{1}{n!} \int_c^x (x - t)^n f^{(n+1)}(t)\,dt. \tag{30} \]

Proof. The proof is by induction on $n$. For $n = 1$ we have

\[ E_1(x) = f(x) - f(c) - f'(c)(x - c) = \int_c^x [f'(t) - f'(c)]\,dt = \int_c^x u(t)\,dv(t), \]

where $u(t) = f'(t) - f'(c)$ and $v(t) = t - x$. Integration by parts gives

\[ \int_c^x u(t)\,dv(t) = u(x)v(x) - u(c)v(c) - \int_c^x v(t)\,du(t) = \int_c^x (x - t) f''(t)\,dt. \]

This proves (30) for $n = 1$. Now we assume (30) is true for $n$ and prove it for
$n + 1$. From (29) we have

\[ E_{n+1}(x) = E_n(x) - \frac{f^{(n+1)}(c)}{(n+1)!} (x - c)^{n+1}. \]

We write $E_n(x)$ as an integral and note that $(x - c)^{n+1} = (n + 1) \int_c^x (x - t)^n\,dt$
to obtain

\[
\begin{aligned}
E_{n+1}(x) &= \frac{1}{n!} \int_c^x (x - t)^n f^{(n+1)}(t)\,dt - \frac{f^{(n+1)}(c)}{n!} \int_c^x (x - t)^n\,dt \\
           &= \frac{1}{n!} \int_c^x (x - t)^n [f^{(n+1)}(t) - f^{(n+1)}(c)]\,dt = \frac{1}{n!} \int_c^x u(t)\,dv(t),
\end{aligned}
\]

where $u(t) = f^{(n+1)}(t) - f^{(n+1)}(c)$ and $v(t) = -(x - t)^{n+1}/(n + 1)$. Integration
by parts gives us

\[ E_{n+1}(x) = -\frac{1}{n!} \int_c^x v(t)\,du(t) = \frac{1}{(n+1)!} \int_c^x (x - t)^{n+1} f^{(n+2)}(t)\,dt. \]

This proves (30).


note. The change of variable $t = x + (c - x)u$ transforms the integral in (30)
to the form

\[ E_n(x) = \frac{(x - c)^{n+1}}{n!} \int_0^1 u^n f^{(n+1)}[x + (c - x)u]\,du. \tag{31} \]


Theorem 9.30 (Bernstein). Assume $f$ and all its derivatives are nonnegative on a
compact interval $[b, b + r]$. Then, if $b \le x < b + r$, the Taylor's series

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(b)}{k!} (x - b)^k \]

converges to $f(x)$.


Proof. By a translation we can assume $b = 0$. The result is trivial if $x = 0$ so
we assume $0 < x < r$. We use Taylor's formula with remainder and write

\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!} x^k + E_n(x). \tag{32} \]

We will prove that the error term satisfies the inequalities

\[ 0 \le E_n(x) \le \left( \frac{x}{r} \right)^{n+1} f(r). \tag{33} \]

This implies that $E_n(x) \to 0$ as $n \to \infty$ since $(x/r)^{n+1} \to 0$ if $0 < x < r$.

To prove (33) we use (31) with $c = 0$ and find

\[ E_n(x) = \frac{x^{n+1}}{n!} \int_0^1 u^n f^{(n+1)}(x - xu)\,du, \]

for each $x$ in $[0, r]$. If $x \ne 0$, let

\[ F_n(x) = \frac{E_n(x)}{x^{n+1}} = \frac{1}{n!} \int_0^1 u^n f^{(n+1)}(x - xu)\,du. \]

The function $f^{(n+1)}$ is monotonic increasing on $[0, r]$ since its derivative is
nonnegative. Therefore we have

\[ f^{(n+1)}(x - xu) = f^{(n+1)}[x(1 - u)] \le f^{(n+1)}[r(1 - u)], \]

if $0 \le u \le 1$, and this implies $F_n(x) \le F_n(r)$ if $0 < x \le r$. In other words,
$E_n(x)/x^{n+1} \le E_n(r)/r^{n+1}$, or

\[ E_n(x) \le \left( \frac{x}{r} \right)^{n+1} E_n(r). \tag{34} \]

Putting $x = r$ in (32), we see that $E_n(r) \le f(r)$ since each term in the sum is
nonnegative. Using this in (34), we obtain (33) which, in turn, completes the proof.
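
The inequalities (33) can be observed numerically for $f = \exp$, whose derivatives are all nonnegative on every interval $[0, r]$. The sketch below takes $b = 0$ and $r = 2$; the function names and the sample points are assumptions of the example, and a small tolerance absorbs floating-point rounding.

```python
import math

def taylor_error_exp(x, n):
    """E_n(x) for f = exp about 0: exp(x) minus the Taylor terms with k = 0, ..., n."""
    return math.exp(x) - sum(x ** k / math.factorial(k) for k in range(n + 1))

def bernstein_bound(x, r, n):
    """Right-hand side of (33) for f = exp: (x/r)^(n+1) * f(r)."""
    return (x / r) ** (n + 1) * math.exp(r)

r = 2.0
checks = [
    (taylor_error_exp(x, n), bernstein_bound(x, r, n))
    for x in (0.5, 1.0, 1.5)
    for n in range(12)
]
```

Every pair `(error, bound)` in `checks` satisfies $0 \le$ error $\le$ bound up to rounding, and the bound shrinks geometrically like $(x/r)^{n+1}$.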

9.21 THE BINOMIAL SERIES 

As an example illustrating the use of Bernstein's theorem, we will obtain the
following expansion, known as the binomial series:

\[ (1 + x)^a = \sum_{n=0}^{\infty} \binom{a}{n} x^n, \qquad \text{if } -1 < x < 1, \tag{35} \]

where $a$ is an arbitrary real number and $\binom{a}{n} = a(a - 1) \cdots (a - n + 1)/n!$.
Bernstein's theorem is not directly applicable in this case. However, we can argue
as follows: Let $f(x) = (1 - x)^{-c}$, where $c > 0$ and $x < 1$. Then

\[ f^{(n)}(x) = c(c + 1) \cdots (c + n - 1)(1 - x)^{-c-n}, \]

and hence $f^{(n)}(x) \ge 0$ for each $n$, provided that $x < 1$. Applying Bernstein's
theorem with $b = -1$ and $r = 2$ we find that $f(x)$ has a power series expansion
about the point $b = -1$, convergent for $-1 \le x < 1$. Therefore, by Theorem
9.22, $f(x)$ also has a power series expansion about 0, $f(x) = \sum_{k=0}^{\infty} f^{(k)}(0) x^k / k!$,
convergent for $-1 < x < 1$. But $f^{(k)}(0) = \binom{-c}{k} (-1)^k k!$, so

\[ \frac{1}{(1 - x)^c} = \sum_{k=0}^{\infty} \binom{-c}{k} (-1)^k x^k, \qquad \text{if } -1 < x < 1. \tag{36} \]

Replacing $c$ by $-a$ and $x$ by $-x$ in (36) we find that (35) is valid for each $a < 0$.
But now (35) can be extended to all real $a$ by successive integration.

Of course, if $a$ is a positive integer, say $a = m$, then $\binom{m}{n} = 0$ for $n > m$, and
(35) reduces to a finite sum (the Binomial Theorem).
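
Partial sums of the binomial series (35) can be compared with $(1 + x)^a$ directly. The following is a minimal sketch; the function names `binom` and `binomial_series` and the numbers of terms are assumptions of the example.

```python
import math

def binom(a, n):
    """Generalized binomial coefficient a(a-1)...(a-n+1)/n! for real a."""
    result = 1.0
    for j in range(n):
        result *= (a - j) / (j + 1)
    return result

def binomial_series(a, x, terms):
    """Partial sum of (35) with n = 0, ..., terms - 1."""
    return sum(binom(a, n) * x ** n for n in range(terms))

sqrt_approx = binomial_series(0.5, 0.3, 60)       # compare with (1.3)^(1/2)
inverse_sq = binomial_series(-2.0, 0.5, 200)      # compare with (1.5)^(-2)
```

When $a$ is a positive integer $m$ the product defining `binom(m, n)` contains the factor $m - m = 0$ for every $n > m$, so the series terminates as the Binomial Theorem requires.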


9.22 ABEL’S LIMIT THEOREM 

If $-1 < x < 1$, integration of the geometric series

\[ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n \]

gives us the series expansion

\[ \log(1 - x) = -\sum_{n=1}^{\infty} \frac{x^n}{n}, \tag{37} \]

also valid for $-1 < x < 1$. If we put $x = -1$ in the righthand side of (37), we
obtain a convergent alternating series, namely, $\sum (-1)^{n+1}/n$. Can we also put
$x = -1$ in the lefthand side of (37)? The next theorem answers this question in
the affirmative.

Theorem 9.31 (Abel's limit theorem). Assume that we have

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad \text{if } -r < x < r. \tag{38} \]

If the series also converges at $x = r$, then the limit $\lim_{x \to r-} f(x)$ exists and we have

\[ \lim_{x \to r-} f(x) = \sum_{n=0}^{\infty} a_n r^n. \]

Proof. For simplicity, assume that $r = 1$ (this amounts to a change in scale).
Then we are given that $f(x) = \sum a_n x^n$ for $-1 < x < 1$ and that $\sum a_n$ converges.
Let us write $f(1) = \sum_{n=0}^{\infty} a_n$. We are to prove that $\lim_{x \to 1-} f(x) = f(1)$, or, in
other words, that $f$ is continuous from the left at $x = 1$.

If we multiply the series for $f(x)$ by the geometric series and use Theorem
9.24, we find

\[ \frac{1}{1 - x} f(x) = \sum_{n=0}^{\infty} c_n x^n, \qquad \text{where } c_n = \sum_{k=0}^{n} a_k. \]

Hence we have

\[ f(x) - f(1) = (1 - x) \sum_{n=0}^{\infty} [c_n - f(1)] x^n, \qquad \text{if } -1 < x < 1. \tag{39} \]

By hypothesis, $\lim_{n \to \infty} c_n = f(1)$. Therefore, given $\varepsilon > 0$, we can find $N$ such that
$n \ge N$ implies $|c_n - f(1)| < \varepsilon/2$. If we split the sum (39) into two parts, we get

\[ f(x) - f(1) = (1 - x) \sum_{n=0}^{N-1} [c_n - f(1)] x^n + (1 - x) \sum_{n=N}^{\infty} [c_n - f(1)] x^n. \tag{40} \]

Let $M$ denote the largest of the $N$ numbers $|c_n - f(1)|$, $n = 0, 1, 2, \dots, N - 1$.
If $0 < x < 1$, (40) gives us

\[
\begin{aligned}
|f(x) - f(1)| &\le (1 - x)NM + (1 - x) \frac{\varepsilon}{2} \sum_{n=N}^{\infty} x^n \\
              &= (1 - x)NM + (1 - x) \frac{\varepsilon}{2} \frac{x^N}{1 - x} < (1 - x)NM + \frac{\varepsilon}{2}.
\end{aligned}
\]

Now let $\delta = \varepsilon/(2NM)$. Then $0 < 1 - x < \delta$ implies $|f(x) - f(1)| < \varepsilon$, which
means $\lim_{x \to 1-} f(x) = f(1)$. This completes the proof.


Example. We may put $x = -1$ in (37) to obtain

\[ \log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}. \]


(See Exercise 8.18 for another derivation of this formula.) 
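
Abel's theorem can be watched at work on this example. Writing $f(x) = \sum_{n \ge 1} (-1)^{n+1} x^n / n = \log(1 + x)$ for $0 \le x < 1$, the values $f(x)$ approach the sum of the alternating series as $x \to 1-$. The sketch below is illustrative only; the function names and truncation lengths are assumptions of the example.

```python
import math

def alt_harmonic_partial(num_terms):
    """Partial sum of sum_{n>=1} (-1)^(n+1)/n."""
    return sum((-1) ** (n + 1) / n for n in range(1, num_terms + 1))

def f(x, num_terms=20000):
    """Partial sum of sum_{n>=1} (-1)^(n+1) x^n / n, which represents log(1 + x)."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, num_terms + 1))

series_sum = alt_harmonic_partial(100000)   # slowly approaches log 2
near_one = f(0.999)                         # close to log(1.999)
farther = f(0.9)                            # close to log(1.9)
```

The value at $x = 0.999$ is closer to $\log 2$ than the value at $x = 0.9$, which is the left-hand continuity at the endpoint that the theorem asserts.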

As an application of Abel’s theorem we can derive the following result on 
multiplication of series: 


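Theorem 9.32. Let $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ be two convergent series and let $\sum_{n=0}^{\infty} c_n$
denote their Cauchy product. If $\sum_{n=0}^{\infty} c_n$ converges, we have

\[ \sum_{n=0}^{\infty} c_n = \left( \sum_{n=0}^{\infty} a_n \right) \left( \sum_{n=0}^{\infty} b_n \right). \]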

note. This result is similar to Theorem 8.46 except that we do not assume absolute 
convergence of either of the two given series. However, we do assume convergence 
of their Cauchy product. 

Proof. The two power series $\sum a_n x^n$ and $\sum b_n x^n$ both converge for $x = 1$, and hence
they converge in the neighborhood $B(0; 1)$. Keep $|x| < 1$ and write

\[ \sum_{n=0}^{\infty} c_n x^n = \left( \sum_{n=0}^{\infty} a_n x^n \right) \left( \sum_{n=0}^{\infty} b_n x^n \right), \]

using Theorem 9.24. Now let $x \to 1-$ and apply Abel's theorem.


9.23 TAUBER’S THEOREM 

The converse of Abel's limit theorem is false in general. That is, if $f$ is given by
(38), the limit $f(r-)$ may exist but yet the series $\sum a_n r^n$ may fail to converge. For
example, take $a_n = (-1)^n$. Then $f(x) = 1/(1 + x)$ if $-1 < x < 1$ and $f(x) \to \frac{1}{2}$
as $x \to 1-$. However, $\sum (-1)^n$ diverges. A. Tauber (1897) discovered that by
placing further restrictions on the coefficients $a_n$, one can obtain a converse to
Abel's theorem. A large number of such results are now known and they are
referred to as Tauberian theorems. The simplest of these, sometimes called Tauber's
first theorem, is the following:

Theorem 9.33 (Tauber). Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ for $-1 < x < 1$, and assume that
$\lim_{n \to \infty} n a_n = 0$. If $f(x) \to S$ as $x \to 1-$, then $\sum_{n=0}^{\infty} a_n$ converges and has sum $S$.

Proof. Let $n \sigma_n = \sum_{k=0}^{n} k |a_k|$. Then $\sigma_n \to 0$ as $n \to \infty$. (See Note following
Theorem 8.48.) Also, $\lim_{n \to \infty} f(x_n) = S$ if $x_n = 1 - 1/n$. Hence, given $\varepsilon > 0$,
we can choose $N$ so that $n \ge N$ implies

\[ |f(x_n) - S| < \frac{\varepsilon}{3}, \qquad \sigma_n < \frac{\varepsilon}{3}, \qquad n |a_n| < \frac{\varepsilon}{3}. \]

Now let $s_n = \sum_{k=0}^{n} a_k$. Then, for $-1 < x < 1$, we can write

\[ s_n - S = f(x) - S + \sum_{k=0}^{n} a_k (1 - x^k) - \sum_{k=n+1}^{\infty} a_k x^k. \]

Now keep $x$ in $(0, 1)$. Then

\[ (1 - x^k) = (1 - x)(1 + x + \cdots + x^{k-1}) \le k(1 - x), \]

for each $k$. Therefore, if $n \ge N$ and $0 < x < 1$, we have

\[ |s_n - S| \le |f(x) - S| + (1 - x) \sum_{k=0}^{n} k |a_k| + \frac{\varepsilon}{3n(1 - x)}. \]

Taking $x = x_n = 1 - 1/n$, we find $|s_n - S| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon$. This
completes the proof.

note. See Exercise 9.37 for another Tauberian theorem. 
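
The counterexample $a_n = (-1)^n$ from the start of this section is easy to tabulate: the partial sums $s_n$ oscillate between 1 and 0, while $f(x) = 1/(1 + x)$ settles toward $\frac{1}{2}$ as $x \to 1-$. Here $n a_n$ does not tend to 0, so the hypothesis of Theorem 9.33 fails. The function names below are assumptions of this sketch.

```python
def partial_sums(num_terms):
    """Partial sums s_0, s_1, ... of sum (-1)^n."""
    total, sums = 0, []
    for n in range(num_terms):
        total += (-1) ** n
        sums.append(total)
    return sums

def f(x, num_terms=10000):
    """Partial sum of sum (-1)^n x^n, which represents 1/(1 + x) for |x| < 1."""
    return sum((-x) ** n for n in range(num_terms))

s = partial_sums(6)          # oscillates, so the series diverges
value = f(0.99)              # close to 1/1.99, already near 1/2
```

The Abel limit exists while the series itself diverges, which is why an extra condition such as $n a_n \to 0$ is needed for a converse.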

EXERCISES 

Uniform convergence 

9.1 Assume that $f_n \to f$ uniformly on $S$ and that each $f_n$ is bounded on $S$. Prove that
$\{f_n\}$ is uniformly bounded on $S$.

9.2 Define two sequences $\{f_n\}$ and $\{g_n\}$ as follows:

\[ f_n(x) = x \left( 1 + \frac{1}{n} \right) \qquad \text{if } x \in \mathbf{R},\ n = 1, 2, \dots, \]

\[
g_n(x) =
\begin{cases}
\dfrac{1}{n} & \text{if } x = 0 \text{ or if } x \text{ is irrational}, \\[1ex]
b + \dfrac{1}{n} & \text{if } x \text{ is rational, say } x = \dfrac{a}{b},\ b > 0.
\end{cases}
\]

Let $h_n(x) = f_n(x) g_n(x)$.

a) Prove that both $\{f_n\}$ and $\{g_n\}$ converge uniformly on every bounded interval.

b) Prove that $\{h_n\}$ does not converge uniformly on any bounded interval.

9.3 Assume that $f_n \to f$ uniformly on $S$, $g_n \to g$ uniformly on $S$.

a) Prove that $f_n + g_n \to f + g$ uniformly on $S$.

b) Let $h_n(x) = f_n(x) g_n(x)$, $h(x) = f(x) g(x)$, if $x \in S$. Exercise 9.2 shows that the
assertion $h_n \to h$ uniformly on $S$ is, in general, incorrect. Prove that it is correct
if each $f_n$ and each $g_n$ is bounded on $S$.

9.4 Assume that $f_n \to f$ uniformly on $S$ and suppose there is a constant $M > 0$ such
that $|f_n(x)| \le M$ for all $x$ in $S$ and all $n$. Let $g$ be continuous on the closure of the disk
$B(0; M)$ and define $h_n(x) = g[f_n(x)]$, $h(x) = g[f(x)]$, if $x \in S$. Prove that $h_n \to h$
uniformly on $S$.

9.5 a) Let $f_n(x) = 1/(nx + 1)$ if $0 < x < 1$, $n = 1, 2, \dots$ Prove that $\{f_n\}$ converges
pointwise but not uniformly on $(0, 1)$.

b) Let $g_n(x) = x/(nx + 1)$ if $0 < x < 1$, $n = 1, 2, \dots$ Prove that $g_n \to 0$
uniformly on $(0, 1)$.

9.6 Let $f_n(x) = x^n$. The sequence $\{f_n\}$ converges pointwise but not uniformly on $[0, 1]$.
Let $g$ be continuous on $[0, 1]$ with $g(1) = 0$. Prove that the sequence $\{g(x) x^n\}$ converges
uniformly on $[0, 1]$.

9.7 Assume that $f_n \to f$ uniformly on $S$, and that each $f_n$ is continuous on $S$. If $x \in S$,
let $\{x_n\}$ be a sequence of points in $S$ such that $x_n \to x$. Prove that $f_n(x_n) \to f(x)$.

9.8 Let $\{f_n\}$ be a sequence of continuous functions defined on a compact set $S$ and
assume that $\{f_n\}$ converges pointwise on $S$ to a limit function $f$. Prove that $f_n \to f$
uniformly on $S$ if, and only if, the following two conditions hold:

i) The limit function $f$ is continuous on $S$.

ii) For every $\varepsilon > 0$, there exists an $m > 0$ and a $\delta > 0$ such that $n > m$ and
$|f_k(x) - f(x)| < \delta$ implies $|f_{k+n}(x) - f(x)| < \varepsilon$ for all $x$ in $S$ and all $k = 1, 2, \dots$

Hint. To prove the sufficiency of (i) and (ii), show that for each $x_0$ in $S$ there is a
neighborhood $B(x_0)$ and an integer $k$ (depending on $x_0$) such that

\[ |f_k(x) - f(x)| < \delta \qquad \text{if } x \in B(x_0). \]

By compactness, a finite set of integers, say $A = \{k_1, \dots, k_r\}$, has the property that, for
each $x$ in $S$, some $k$ in $A$ satisfies $|f_k(x) - f(x)| < \delta$. Uniform convergence is an easy
consequence of this fact.

9.9 a) Use Exercise 9.8 to prove the following theorem of Dini: If $\{f_n\}$ is a sequence of
real-valued continuous functions converging pointwise to a continuous limit function
$f$ on a compact set $S$, and if $f_n(x) \ge f_{n+1}(x)$ for each $x$ in $S$ and every $n = 1, 2, \dots$,
then $f_n \to f$ uniformly on $S$.

b) Use the sequence in Exercise 9.5(a) to show that compactness of $S$ is essential in
Dini's theorem.

9.10 Let $f_n(x) = n^c x (1 - x^2)^n$ for $x$ real and $n \ge 1$. Prove that $\{f_n\}$ converges pointwise
on $[0, 1]$ for every real $c$. Determine those $c$ for which the convergence is uniform on
$[0, 1]$ and those for which term-by-term integration on $[0, 1]$ leads to a correct result.

9.11 Prove that $\sum x^n (1 - x)$ converges pointwise but not uniformly on $[0, 1]$, whereas
$\sum (-1)^n x^n (1 - x)$ converges uniformly on $[0, 1]$. This illustrates that uniform convergence
of $\sum f_n(x)$ along with pointwise convergence of $\sum |f_n(x)|$ does not necessarily imply uniform
convergence of $\sum |f_n(x)|$.

9.12 Assume that $g_{n+1}(x) \le g_n(x)$ for each $x$ in $T$ and each $n = 1, 2, \dots$, and suppose
that $g_n \to 0$ uniformly on $T$. Prove that $\sum (-1)^{n+1} g_n(x)$ converges uniformly on $T$.

9.13 Prove Abel's test for uniform convergence: Let $\{g_n\}$ be a sequence of real-valued
functions such that $g_{n+1}(x) \le g_n(x)$ for each $x$ in $T$ and for every $n = 1, 2, \dots$ If $\{g_n\}$
is uniformly bounded on $T$ and if $\sum f_n(x)$ converges uniformly on $T$, then $\sum f_n(x) g_n(x)$
also converges uniformly on $T$.

9.14 Let $f_n(x) = x/(1 + nx^2)$ if $x \in \mathbf{R}$, $n = 1, 2, \dots$ Find the limit function $f$ of the
sequence $\{f_n\}$ and the limit function $g$ of the sequence $\{f_n'\}$.

a) Prove that $f'(x)$ exists for every $x$ but that $f'(0) \ne g(0)$. For what values of $x$ is
$f'(x) = g(x)$?

b) In what subintervals of R does $f_n \to f$ uniformly?

c) In what subintervals of R does $f_n' \to g$ uniformly?

9.15 Let $f_n(x) = (1/n) e^{-n^2 x^2}$ if $x \in \mathbf{R}$, $n = 1, 2, \dots$ Prove that $f_n \to 0$ uniformly on R,
that $f_n' \to 0$ pointwise on R, but that the convergence of $\{f_n'\}$ is not uniform on any interval
containing the origin.

9.16 Let $\{f_n\}$ be a sequence of real-valued continuous functions defined on $[0, 1]$ and
assume that $f_n \to f$ uniformly on $[0, 1]$. Prove or disprove

\[ \lim_{n \to \infty} \int_0^{1 - 1/n} f_n(x)\,dx = \int_0^1 f(x)\,dx. \]


9.17 Mathematicians from Slobbovia decided that the Riemann integral was too
complicated so they replaced it by the Slobbovian integral, defined as follows: If $f$ is a function
defined on the set $Q$ of rational numbers in $[0, 1]$, the Slobbovian integral of $f$, denoted
by $S(f)$, is defined to be the limit

\[ S(f) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} f\!\left( \frac{k}{n} \right), \]

whenever this limit exists. Let $\{f_n\}$ be a sequence of functions such that $S(f_n)$ exists for
each $n$ and such that $f_n \to f$ uniformly on $Q$. Prove that $\{S(f_n)\}$ converges, that $S(f)$
exists, and that $S(f_n) \to S(f)$ as $n \to \infty$.

9.18 Let $f_n(x) = 1/(1 + n^2 x^2)$ if $0 \le x \le 1$, $n = 1, 2, \dots$ Prove that $\{f_n\}$ converges
pointwise but not uniformly on $[0, 1]$. Is term-by-term integration permissible?

9.19 Prove that $\sum_{n=1}^{\infty} x/(n^{\alpha}(1 + nx^2))$ converges uniformly on every finite interval in R
if $\alpha > \frac{1}{2}$. Is the convergence uniform on R?

9.20 Prove that the series $\sum_{n=1}^{\infty} ((-1)^n/\sqrt{n}) \sin(1 + (x/n))$ converges uniformly on every
compact subset of R.

9.21 Prove that the series $\sum_{n=0}^{\infty} (x^{2n+1}/(2n + 1) - x^{n+1}/(2n + 2))$ converges pointwise
but not uniformly on $[0, 1]$.

9.22 Prove that $\sum_{n=1}^{\infty} a_n \sin nx$ and $\sum_{n=1}^{\infty} a_n \cos nx$ are uniformly convergent on R if
$\sum_{n=1}^{\infty} |a_n|$ converges.

9.23 Let $\{a_n\}$ be a decreasing sequence of positive terms. Prove that the series $\sum a_n \sin nx$
converges uniformly on R if, and only if, $n a_n \to 0$ as $n \to \infty$.

9.24 Given a convergent series $\sum_{n=1}^{\infty} a_n$. Prove that the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$
converges uniformly on the half-infinite interval $0 \le s < +\infty$. Use this to prove that
$\lim_{s \to 0+} \sum_{n=1}^{\infty} a_n n^{-s} = \sum_{n=1}^{\infty} a_n$.

9.25 Prove that the series $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ converges uniformly on every half-infinite
interval $1 + h \le s < +\infty$, where $h > 0$. Show that the equation

\[ \zeta'(s) = -\sum_{n=1}^{\infty} \frac{\log n}{n^s} \]

is valid for each $s > 1$ and obtain a similar formula for the $k$th derivative $\zeta^{(k)}(s)$.


Mean convergence 

9.26 Let $f_n(x) = n^{3/2} x e^{-n^2 x^2}$. Prove that $\{f_n\}$ converges pointwise to 0 on $[-1, 1]$ but
that $\operatorname{l.i.m.}_{n \to \infty} f_n \ne 0$ on $[-1, 1]$.

9.27 Assume that $\{f_n\}$ converges pointwise to $f$ on $[a, b]$ and that $\operatorname{l.i.m.}_{n \to \infty} f_n = g$ on
$[a, b]$. Prove that $f = g$ if both $f$ and $g$ are continuous on $[a, b]$.

9.28 Let $f_n(x) = \cos^n x$ if $0 \le x \le \pi$.

a) Prove that $\operatorname{l.i.m.}_{n \to \infty} f_n = 0$ on $[0, \pi]$ but that $\{f_n(\pi)\}$ does not converge.

b) Prove that $\{f_n\}$ converges pointwise but not uniformly on $[0, \pi/2]$.

9.29 Let $f_n(x) = 0$ if $0 \le x \le 1/n$ or if $2/n \le x \le 1$, and let $f_n(x) = n$ if $1/n < x < 2/n$.
Prove that $\{f_n\}$ converges pointwise to 0 on $[0, 1]$ but that $\operatorname{l.i.m.}_{n \to \infty} f_n \ne 0$ on $[0, 1]$.


Power series 


9.30 If $r$ is the radius of convergence of $\sum a_n (z - z_0)^n$, where each $a_n \ne 0$, show that

\[ \liminf_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| \le r \le \limsup_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right|. \]

9.31 Given that the power series $\sum_{n=0}^{\infty} a_n z^n$ has radius of convergence 2. Find the radius
of convergence of each of the following series:

a) $\sum_{n=0}^{\infty} a_n^k z^n$,    b) $\sum_{n=0}^{\infty} a_n z^{kn}$,    c) $\sum_{n=0}^{\infty} a_n z^{n^2}$.

In (a) and (b), $k$ is a fixed positive integer.

9.32 Given a power series $\sum_{n=0}^{\infty} a_n x^n$ whose coefficients are related by an equation of the
form

\[ a_n + A a_{n-1} + B a_{n-2} = 0 \qquad (n = 2, 3, \dots). \]

Show that for any $x$ for which the series converges, its sum is

\[ \frac{a_0 + (a_1 + A a_0) x}{1 + Ax + Bx^2}. \]

9.33 Let $f(x) = e^{-1/x^2}$ if $x \ne 0$, $f(0) = 0$.

a) Show that $f^{(n)}(0)$ exists for all $n \ge 1$.

b) Show that the Taylor's series about 0 generated by $f$ converges everywhere on R
but that it represents $f$ only at the origin.

9.34 Show that the binomial series $(1 + x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$ exhibits the following
behavior at the points $x = \pm 1$.

a) If $x = -1$, the series converges for $\alpha \ge 0$ and diverges for $\alpha < 0$.

b) If $x = 1$, the series diverges for $\alpha \le -1$, converges conditionally for $\alpha$ in the
interval $-1 < \alpha < 0$, and converges absolutely for $\alpha \ge 0$.

9.35 Show that $\sum a_n x^n$ converges uniformly on $[0, 1]$ if $\sum a_n$ converges. Use this fact to
give another proof of Abel's limit theorem.

9.36 If each $a_n \ge 0$ and if $\sum a_n$ diverges, show that $\sum a_n x^n \to +\infty$ as $x \to 1-$. (Assume
$\sum a_n x^n$ converges for $|x| < 1$.)

9.37 If each $a_n \ge 0$ and if $\lim_{x \to 1-} \sum a_n x^n$ exists and equals $A$, prove that $\sum a_n$ converges
and has sum $A$. (Compare with Theorem 9.33.)

9.38 For each real $t$, define $f_t(x) = x e^{xt}/(e^x - 1)$ if $x \in \mathbf{R}$, $x \ne 0$, $f_t(0) = 1$.

a) Show that there is a disk $B(0; \delta)$ in which $f_t$ is represented by a power series in $x$.

b) Define $P_0(t), P_1(t), P_2(t), \dots$ by the equation

\[ f_t(x) = \sum_{n=0}^{\infty} P_n(t) \frac{x^n}{n!}, \qquad \text{if } x \in B(0; \delta), \]

and use the identity

\[ \sum_{n=0}^{\infty} P_n(t) \frac{x^n}{n!} = e^{tx} \sum_{n=0}^{\infty} P_n(0) \frac{x^n}{n!} \]

to prove that $P_n(t) = \sum_{k=0}^{n} \binom{n}{k} P_k(0) t^{n-k}$. This shows that each function $P_n$ is a
polynomial. These are the Bernoulli polynomials. The numbers $B_n = P_n(0)$
($n = 0, 1, 2, \dots$) are called the Bernoulli numbers. Derive the following further
properties:

c) $B_0 = 1$, $B_1 = -\frac{1}{2}$, $\sum_{k=0}^{n-1} \binom{n}{k} B_k = 0$, if $n = 2, 3, \dots$

d) $P_n'(t) = n P_{n-1}(t)$, if $n = 1, 2, \dots$

e) $P_n(t + 1) - P_n(t) = n t^{n-1}$ if $n = 1, 2, \dots$

f) $P_n(1 - t) = (-1)^n P_n(t)$    g) $B_{2n+1} = 0$ if $n = 1, 2, \dots$

h) $1^n + 2^n + \cdots + (k - 1)^n = \dfrac{P_{n+1}(k) - P_{n+1}(0)}{n + 1}$ ($n = 2, 3, \dots$).


CHAPTER 10


THE LEBESGUE INTEGRAL


10.1 INTRODUCTION 

The Riemann integral $\int_a^b f(x)\,dx$, as developed in Chapter 7, is well motivated,
simple to describe, and serves all the needs of elementary calculus. However, this
integral does not meet all the requirements of advanced analysis. An extension,
called the Lebesgue integral, is discussed in this chapter. It permits more general
functions as integrands, it treats bounded and unbounded functions simultaneously,
and it enables us to replace the interval $[a, b]$ by more general sets.

The Lebesgue integral also gives more satisfying convergence theorems. If a
sequence of functions $\{f_n\}$ converges pointwise to a limit function $f$ on $[a, b]$, it
is desirable to conclude that

\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx \]

with a minimum of additional hypotheses. The definitive result of this type is
Lebesgue's dominated convergence theorem, which permits term-by-term
integration if each $f_n$ is Lebesgue-integrable and if the sequence is dominated by a
Lebesgue-integrable function. (See Theorem 10.27.) Here Lebesgue integrals are
essential. The theorem is false for Riemann integrals.

In Riemann's approach the interval of integration is subdivided into a finite
number of subintervals. In Lebesgue's approach the interval is subdivided into
more general types of sets called measurable sets. In a classic memoir, Intégrale,
longueur, aire, published in 1902, Lebesgue gave a definition of measure for point
sets and applied this to develop his new integral.

Since Lebesgue’s early work, both measure theory and integration theory have 
undergone many generalizations and modifications. The work of Young, Daniell, 
Riesz, Stone, and others has shown that the Lebesgue integral can be introduced 
by a method which does not depend on measure theory but which focuses directly 
on functions and their integrals. This chapter follows this approach, as outlined 
in Reference 10.10. The only concept required from measure theory is sets of
measure zero, a simple idea introduced in Chapter 7. Later, we indicate briefly
how measure theory can be developed with the help of the Lebesgue integral.


10.2 THE INTEGRAL OF A STEP FUNCTION

The approach used here is to define the integral first for step functions, then for a
larger class (called upper functions) which contains limits of certain increasing
sequences of step functions, and finally for an even larger class, the
Lebesgue-integrable functions.

We recall that a function $s$, defined on a compact interval $[a, b]$, is called a step function if there is a partition $P = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$ such that $s$ is constant on every open subinterval, say 

$$s(x) = c_k \quad \text{if } x \in (x_{k-1}, x_k).$$

A step function is Riemann-integrable on each subinterval $[x_{k-1}, x_k]$ and its integral over this subinterval is given by 

$$\int_{x_{k-1}}^{x_k} s(x)\,dx = c_k (x_k - x_{k-1}),$$

regardless of the values of $s$ at the endpoints. The Riemann integral of $s$ over $[a, b]$ is therefore equal to the sum 

$$\int_a^b s(x)\,dx = \sum_{k=1}^{n} c_k (x_k - x_{k-1}). \tag{1}$$

note. Lebesgue theory can be developed without prior knowledge of Riemann 
integration by using equation (1) as the definition of the integral of a step function. 
It should be noted that the sum in (1) is independent of the choice of P as long as s 
is constant on the open subintervals of P. 
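
The claim that the sum in (1) does not depend on the partition can be checked on a small case. The following is a minimal sketch, not part of the original text; the helper name `step_integral` and the sample step function are assumptions made for illustration, and exact rational arithmetic is used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def step_integral(partition, values):
    """Integral of a step function via equation (1).

    partition: increasing points x_0 < x_1 < ... < x_n
    values:    constants c_1, ..., c_n taken on the open subintervals
    """
    if len(partition) != len(values) + 1:
        raise ValueError("need exactly one value per open subinterval")
    return sum(c * (right - left)
               for c, left, right in zip(values, partition, partition[1:]))

# Hypothetical example: s = 2 on (0, 1) and s = 5 on (1, 3).
# Endpoint values do not enter the sum.
coarse = step_integral([Fraction(0), Fraction(1), Fraction(3)],
                       [Fraction(2), Fraction(5)])

# Refine P by inserting the point 2 inside (1, 3). s is still constant
# on each open subinterval, so the sum should not change.
fine = step_integral([Fraction(0), Fraction(1), Fraction(2), Fraction(3)],
                     [Fraction(2), Fraction(5), Fraction(5)])

print(coarse, fine)
```

Both calls evaluate the same step function, once on a partition and once on a refinement of it, and the two sums agree.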

It is convenient to remove the restriction that the domain of a step function be 
compact. 

Definition 10.1. Let $I$ denote a general interval (bounded, unbounded, open, closed, or half-open). A function $s$ is called a step function on $I$ if there is a compact subinterval $[a, b]$ of $I$ such that $s$ is a step function on $[a, b]$ and $s(x) = 0$ if $x \in I - [a, b]$. The integral of $s$ over $I$, denoted by $\int_I s(x)\,dx$ or by $\int_I s$, is defined to be the integral of $s$ over $[a, b]$, as given by (1). 

There are, of course, many compact intervals $[a, b]$ outside of which $s$ vanishes, but the integral of $s$ is independent of the choice of $[a, b]$. 

The sum and product of two step functions are also step functions. The following properties of the integral for step functions are easily deduced from the foregoing definition: 

$$\int_I (s + t) = \int_I s + \int_I t, \qquad \int_I cs = c \int_I s \quad \text{for every constant } c,$$

$$\int_I s \le \int_I t \quad \text{if } s(x) \le t(x) \text{ for all } x \text{ in } I.$$

Also, if $I$ is expressed as the union of a finite set of subintervals, say $I = \bigcup_{r=1}^{p} [a_r, b_r]$, where no two subintervals have interior points in common, then 

$$\int_I s(x)\,dx = \sum_{r=1}^{p} \int_{a_r}^{b_r} s(x)\,dx.$$


10.3 MONOTONIC SEQUENCES OF STEP FUNCTIONS 

A sequence of real-valued functions $\{f_n\}$ defined on a set $S$ is said to be increasing on $S$ if 

$$f_n(x) \le f_{n+1}(x) \quad \text{for all } x \text{ in } S \text{ and all } n.$$

A decreasing sequence is one satisfying the reverse inequality. 

note. We remind the reader that a subset $T$ of $\mathbf{R}$ is said to be of measure 0 if, for every $\varepsilon > 0$, $T$ can be covered by a countable collection of intervals, the sum of whose lengths is less than $\varepsilon$. A property is said to hold almost everywhere on a set $S$ (written: a.e. on $S$) if it holds everywhere on $S$ except for a set of measure 0. 
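
As an illustration of this definition, the rationals in $[0, 1]$ form a countable set, and a countable set can be covered by giving its $k$-th point an interval of length $\varepsilon / 2^{k+1}$. The sketch below is an added example, not taken from the text; the enumeration order and the function names are assumptions chosen for convenience. It builds the first few intervals of such a cover and confirms that their total length stays below $\varepsilon$.

```python
from fractions import Fraction
from itertools import islice

def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1] without repetition."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # skip values already listed in lower terms
                yield r
        q += 1

def cover(eps, count):
    """First `count` intervals of a cover whose total length is below eps.

    The k-th rational r_k (k = 1, 2, ...) receives the open interval
    centred at r_k with length eps / 2**(k + 1).
    """
    intervals = []
    points = islice(rationals_in_unit_interval(), count)
    for k, r in enumerate(points, start=1):
        half = eps / 2 ** (k + 2)
        intervals.append((r - half, r + half))
    return intervals

eps = Fraction(1, 100)
intervals = cover(eps, 50)
total = sum(b - a for a, b in intervals)
print(total < eps)
```

The lengths form a geometric series with sum $\varepsilon / 2$, so the bound holds no matter how many intervals are generated.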

notation. If $\{f_n\}$ is an increasing sequence of functions on $S$ such that $f_n \to f$ almost everywhere on $S$, we indicate this by writing 

$$f_n \nearrow f \quad \text{a.e. on } S.$$

Similarly, the notation $f_n \searrow f$ a.e. on $S$ means that $\{f_n\}$ is a decreasing sequence on $S$ which converges to $f$ almost everywhere on $S$. 

The next theorem is concerned with decreasing sequences of step functions on 
a general interval I. 


Theorem 10.2. Let $\{s_n\}$ be a decreasing sequence of nonnegative step functions such that $s_n \searrow 0$ a.e. on an interval $I$. Then 

$$\lim_{n \to \infty} \int_I s_n = 0.$$


Proof. The idea of the proof is to write 

$$\int_I s_n = \int_A s_n + \int_B s_n,$$

where each of $A$ and $B$ is a finite union of intervals. The set $A$ is chosen so that in its intervals the integrand is small if $n$ is sufficiently large. In $B$ the integrand need not be small but the sum of the lengths of its intervals will be small. To carry out this idea we proceed as follows. 

There is a compact interval $[a, b]$ outside of which $s_1$ vanishes. Since 

$$0 \le s_n(x) \le s_1(x) \quad \text{for all } x \text{ in } I,$$

each $s_n$ vanishes outside $[a, b]$. Now $s_n$ is constant on each open subinterval of some partition of $[a, b]$. Let $D_n$ denote the set of endpoints of these subintervals, and let $D = \bigcup_{n=1}^{\infty} D_n$. Since each $D_n$ is a finite set, the union $D$ is countable and therefore has measure 0. Let $E$ denote the set of points in $[a, b]$ at which the sequence $\{s_n\}$ does not converge to 0. By hypothesis, $E$ has measure 0 so the set 

$$F = D \cup E$$

also has measure 0. Therefore, if $\varepsilon > 0$ is given we can cover $F$ by a countable collection of open intervals $F_1, F_2, \dots$, the sum of whose lengths is less than $\varepsilon$. 

Now suppose $x \in [a, b] - F$. Then $x \notin E$, so $s_n(x) \to 0$ as $n \to \infty$. Therefore there is an integer $N = N(x)$ such that $s_N(x) < \varepsilon$. Also, $x \notin D$ so $x$ is interior to some interval of constancy of $s_N$. Hence there is an open interval $B(x)$ such that $s_N(t) < \varepsilon$ for all $t$ in $B(x)$. Since $\{s_n\}$ is decreasing, we also have 

$$s_n(t) < \varepsilon \quad \text{for all } n \ge N \text{ and all } t \text{ in } B(x). \tag{2}$$

The set of all intervals $B(x)$ obtained as $x$ ranges through $[a, b] - F$, together with the intervals $F_1, F_2, \dots$, form an open covering of $[a, b]$. Since $[a, b]$ is compact there is a finite subcover, say 

$$[a, b] \subseteq \bigcup_{i=1}^{p} B(x_i) \cup \bigcup_{r=1}^{q} F_r.$$


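Let $N_0$ denote the largest of the integers $N(x_1), \dots, N(x_p)$. From (2) we see that 

$$s_n(t) < \varepsilon \quad \text{for all } n \ge N_0 \text{ and all } t \text{ in } \bigcup_{i=1}^{p} B(x_i). \tag{3}$$

Now define $A$ and $B$ as follows: 

$$B = \bigcup_{r=1}^{q} F_r, \qquad A = [a, b] - B.$$

Then $A$ is a finite union of disjoint intervals and we have 

$$\int_I s_n = \int_a^b s_n = \int_A s_n + \int_B s_n.$$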

First we estimate the integral over $B$. Let $M$ be an upper bound for $s_1$ on $[a, b]$. Since $\{s_n\}$ is decreasing, we have $s_n(x) \le s_1(x) \le M$ for all $x$ in $[a, b]$. The sum of the lengths of the intervals in $B$ is less than $\varepsilon$, so we have 

$$\int_B s_n \le M\varepsilon.$$

Next we estimate the integral over $A$. Since $A \subseteq \bigcup_{i=1}^{p} B(x_i)$, the inequality in (3) shows that $s_n(x) < \varepsilon$ if $x \in A$ and $n \ge N_0$. The sum of the lengths of the intervals in $A$ does not exceed $b - a$, so we have the estimate 

$$\int_A s_n \le (b - a)\varepsilon \quad \text{if } n \ge N_0.$$

The two estimates together give us $\int_I s_n \le (M + b - a)\varepsilon$ if $n \ge N_0$, and this shows that $\lim_{n \to \infty} \int_I s_n = 0$. 
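
A concrete instance may help to show why convergence is only required almost everywhere. The sketch below is an added illustration, not part of the original proof: it assumes the hypothetical sequence $s_n$ equal to 1 on $[0, 1/n)$ and 0 elsewhere on $I = [0, 1]$. This sequence is decreasing and nonnegative, and it tends to 0 at every point except $x = 0$ (a set of measure 0), while its integrals $1/n$ tend to 0 as Theorem 10.2 predicts.

```python
from fractions import Fraction

def s(n, x):
    """s_n = 1 on [0, 1/n) and 0 elsewhere on I = [0, 1]."""
    return 1 if 0 <= x < Fraction(1, n) else 0

def integral_s(n):
    """Integral of s_n over I by equation (1): height 1, length 1/n."""
    return Fraction(1, n)

points = [Fraction(0), Fraction(1, 1000), Fraction(1, 3), Fraction(1)]
for x in points:
    values = [s(n, x) for n in range(1, 2001)]
    # The sequence of values at each sample point is decreasing in n.
    assert all(a >= b for a, b in zip(values, values[1:]))
    print(x, values[-1])

print([integral_s(n) for n in (1, 10, 100, 1000)])
```

At $x = 0$ the values stay equal to 1 for every $n$, yet this single exceptional point has no effect on the limit of the integrals.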


Theorem 10.3. Let $\{t_n\}$ be a sequence of step functions on an interval $I$ such that: 

a) There is a function $f$ such that $t_n \nearrow f$ a.e. on $I$, 
and 

b) the sequence $\{\int_I t_n\}$ converges. 

Then for any step function $t$ such that $t(x) \le f(x)$ a.e. on $I$, we have 

$$\int_I t \le \lim_{n \to \infty} \int_I t_n. \tag{4}$$

Proof. Define a new sequence of nonnegative step functions $\{s_n\}$ on $I$ as follows: 

$$s_n(x) = \begin{cases} t(x) - t_n(x) & \text{if } t(x) \ge t_n(x), \\ 0 & \text{if } t(x) < t_n(x). \end{cases}$$

Note that $s_n(x) = \max\{t(x) - t_n(x), 0\}$. Now $\{s_n\}$ is decreasing on $I$ since $\{t_n\}$ is increasing, and $s_n(x) \to \max\{t(x) - f(x), 0\}$ a.e. on $I$. But $t(x) \le f(x)$ a.e. on $I$, and therefore $s_n \searrow 0$ a.e. on $I$. Hence, by Theorem 10.2, $\lim_{n \to \infty} \int_I s_n = 0$. But $s_n(x) \ge t(x) - t_n(x)$ for all $x$ in $I$, so 

$$\int_I s_n \ge \int_I t - \int_I t_n.$$

Now let $n \to \infty$ to obtain (4). 


10.4 UPPER FUNCTIONS AND THEIR INTEGRALS 

Let $S(I)$ denote the set of all step functions on an interval $I$. The integral has been defined for all functions in $S(I)$. Now we shall extend the definition to a larger class $U(I)$ which contains limits of certain increasing sequences of step functions. The functions in this class are called upper functions and they are defined as follows: 

Definition 10.4. A real-valued function $f$ defined on an interval $I$ is called an upper function on $I$, and we write $f \in U(I)$, if there exists an increasing sequence of step functions $\{s_n\}$ such that 

a) $s_n \nearrow f$ a.e. on $I$, 
and 

b) $\lim_{n \to \infty} \int_I s_n$ is finite. 

The sequence $\{s_n\}$ is said to generate $f$. The integral of $f$ over $I$ is defined by the equation 

$$\int_I f = \lim_{n \to \infty} \int_I s_n. \tag{5}$$

note. Since $\{\int_I s_n\}$ is an increasing sequence of real numbers, condition (b) is equivalent to saying that $\{\int_I s_n\}$ is bounded above. 


The next theorem shows that the definition of the integral in (5) is unambiguous. 


Theorem 10.5. Assume $f \in U(I)$ and let $\{s_n\}$ and $\{t_m\}$ be two sequences generating $f$. Then 

$$\lim_{n \to \infty} \int_I s_n = \lim_{m \to \infty} \int_I t_m.$$

Proof. The sequence $\{t_m\}$ satisfies hypotheses (a) and (b) of Theorem 10.3. Also, for every $n$ we have 

$$s_n(x) \le f(x) \quad \text{a.e. on } I,$$

so (4) gives us 

$$\int_I s_n \le \lim_{m \to \infty} \int_I t_m.$$

Since this holds for every $n$, we have 

$$\lim_{n \to \infty} \int_I s_n \le \lim_{m \to \infty} \int_I t_m.$$

The same argument, with the sequences $\{s_n\}$ and $\{t_m\}$ interchanged, gives the reverse inequality and completes the proof. 


It is easy to see that every step function is an upper function and that its 
integral, as given by (5), is the same as that given by the earlier definition in 
Section 10.2. Further properties of the integral for upper functions are described 
in the next theorem. 


Theorem 10.6. Assume $f \in U(I)$ and $g \in U(I)$. Then: 

a) $(f + g) \in U(I)$ and 

$$\int_I (f + g) = \int_I f + \int_I g.$$

b) $cf \in U(I)$ for every constant $c \ge 0$, and 

$$\int_I cf = c \int_I f.$$

c) $\int_I f \le \int_I g$ if $f(x) \le g(x)$ a.e. on $I$. 

note. In part (b) the requirement $c \ge 0$ is essential. There are examples for which $f \in U(I)$ but $-f \notin U(I)$. (See Exercise 10.4.) However, if $f \in U(I)$ and if $s \in S(I)$, then $f - s \in U(I)$ since $f - s = f + (-s)$. 

Proof. Parts (a) and (b) are easy consequences of the corresponding properties for step functions. To prove (c), let $\{s_m\}$ be a sequence which generates $f$ and let $\{t_n\}$ be a sequence which generates $g$. Then $s_m \nearrow f$ and $t_n \nearrow g$ a.e. on $I$, and 

$$\lim_{m \to \infty} \int_I s_m = \int_I f, \qquad \lim_{n \to \infty} \int_I t_n = \int_I g.$$

But for each $m$ we have 

$$s_m(x) \le f(x) \le g(x) = \lim_{n \to \infty} t_n(x) \quad \text{a.e. on } I.$$

Hence, by Theorem 10.3, 

$$\int_I s_m \le \lim_{n \to \infty} \int_I t_n = \int_I g.$$

Now, let $m \to \infty$ to obtain (c). 

The next theorem describes an important consequence of part (c). 

Theorem 10.7. If $f \in U(I)$ and $g \in U(I)$, and if $f(x) = g(x)$ almost everywhere on $I$, then $\int_I f = \int_I g$. 

Proof. We have both inequalities $f(x) \le g(x)$ and $g(x) \le f(x)$ almost everywhere on $I$, so Theorem 10.6(c) gives $\int_I f \le \int_I g$ and $\int_I g \le \int_I f$. 

Definition 10.8. Let $f$ and $g$ be real-valued functions defined on $I$. We define $\max(f, g)$ and $\min(f, g)$ to be the functions whose values at each $x$ in $I$ are equal to $\max\{f(x), g(x)\}$ and $\min\{f(x), g(x)\}$, respectively. 

The reader can easily verify the following properties of max and min: 

a) $\max(f, g) + \min(f, g) = f + g$, 

b) $\max(f + h, g + h) = \max(f, g) + h$, and $\min(f + h, g + h) = \min(f, g) + h$. 

If $f_n \nearrow f$ a.e. on $I$, and if $g_n \nearrow g$ a.e. on $I$, then 

c) $\max(f_n, g_n) \nearrow \max(f, g)$ a.e. on $I$, and $\min(f_n, g_n) \nearrow \min(f, g)$ a.e. on $I$. 
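
Properties (a) and (b) can be spot-checked numerically. The following sketch is an added illustration; the sample functions (sine, cosine, and $x^2$) and the helper names are assumptions chosen only to exercise the identities at a grid of points. A numerical check of this kind is not a proof.

```python
import math

def pointwise_max(f, g):
    """max(f, g) as a function of x."""
    return lambda x: max(f(x), g(x))

def pointwise_min(f, g):
    """min(f, g) as a function of x."""
    return lambda x: min(f(x), g(x))

f, g = math.sin, math.cos
h = lambda x: x * x
xs = [k / 10 for k in range(-30, 31)]

upper = pointwise_max(f, g)
lower = pointwise_min(f, g)

# Property (a): max(f, g) + min(f, g) = f + g.
prop_a = all(
    math.isclose(upper(x) + lower(x), f(x) + g(x), abs_tol=1e-12)
    for x in xs
)

# Property (b): adding h to both arguments shifts max and min by h.
f_plus_h = lambda x: f(x) + h(x)
g_plus_h = lambda x: g(x) + h(x)
prop_b = all(
    math.isclose(pointwise_max(f_plus_h, g_plus_h)(x), upper(x) + h(x), abs_tol=1e-12)
    and math.isclose(pointwise_min(f_plus_h, g_plus_h)(x), lower(x) + h(x), abs_tol=1e-12)
    for x in xs
)

print(prop_a, prop_b)
```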


Theorem 10.9. If $f \in U(I)$ and $g \in U(I)$, then $\max(f, g) \in U(I)$ and $\min(f, g) \in U(I)$. 

Proof. Let $\{s_n\}$ and $\{t_n\}$ be sequences of step functions which generate $f$ and $g$, respectively, and let $u_n = \max(s_n, t_n)$, $v_n = \min(s_n, t_n)$. Then $u_n$ and $v_n$ are step functions such that $u_n \nearrow \max(f, g)$ and $v_n \nearrow \min(f, g)$ a.e. on $I$. 

To prove that $\min(f, g) \in U(I)$, it suffices to show that the sequence $\{\int_I v_n\}$ is bounded above. But $v_n = \min(s_n, t_n) \le f$ a.e. on $I$, so $\int_I v_n \le \int_I f$. Therefore the sequence $\{\int_I v_n\}$ converges. But the sequence $\{\int_I u_n\}$ also converges since, by property (a), $u_n = s_n + t_n - v_n$ and hence 

$$\int_I u_n = \int_I s_n + \int_I t_n - \int_I v_n \to \int_I f + \int_I g - \int_I \min(f, g).$$


The next theorem describes an additive property of the integral with respect 
to the interval of integration. 

Theorem 10.10. Let $I$ be an interval which is the union of two subintervals, say $I = I_1 \cup I_2$, where $I_1$ and $I_2$ have no interior points in common. 

a) If $f \in U(I)$ and if $f \ge 0$ a.e. on $I$, then $f \in U(I_1)$, $f \in U(I_2)$, and 

$$\int_I f = \int_{I_1} f + \int_{I_2} f. \tag{6}$$

b) Assume $f_1 \in U(I_1)$, $f_2 \in U(I_2)$, and let $f$ be defined on $I$ as follows: 

$$f(x) = \begin{cases} f_1(x) & \text{if } x \in I_1, \\ f_2(x) & \text{if } x \in I - I_1. \end{cases}$$

Then $f \in U(I)$ and 

$$\int_I f = \int_{I_1} f_1 + \int_{I_2} f_2.$$

Proof. If $\{s_n\}$ is an increasing sequence of step functions which generates $f$ on $I$, let $s_n^+(x) = \max\{s_n(x), 0\}$ for each $x$ in $I$. Then $\{s_n^+\}$ is an increasing sequence of nonnegative step functions which generates $f$ on $I$ (since $f \ge 0$). Moreover, for every subinterval $J$ of $I$ we have $\int_J s_n^+ \le \int_I s_n^+ \le \int_I f$ so $\{s_n^+\}$ generates $f$ on $J$. Also 

$$\int_I s_n^+ = \int_{I_1} s_n^+ + \int_{I_2} s_n^+,$$

so we let $n \to \infty$ to obtain (a). The proof of (b) is left as an exercise. 

note. There is a corresponding theorem (which can be proved by induction) for 
an interval which is expressed as the union of a finite number of subintervals, no 
two of which have interior points in common. 


10.5 RIEMANN-INTEGRABLE FUNCTIONS AS EXAMPLES OF UPPER 
FUNCTIONS 

The next theorem shows that the class of upper functions includes all the Riemann- 
integrable functions. 

Theorem 10.11. Let $f$ be defined and bounded on a compact interval $[a, b]$, and assume that $f$ is continuous almost everywhere on $[a, b]$. Then $f \in U([a, b])$ and the integral of $f$, as a function in $U([a, b])$, is equal to the Riemann integral $\int_a^b f(x)\,dx$. 

Proof. Let $P_n = \{x_0, x_1, \dots, x_{2^n}\}$ be a partition of $[a, b]$ into $2^n$ equal subintervals of length $(b - a)/2^n$. The subintervals of $P_{n+1}$ are obtained by bisecting those of $P_n$. Let 

$$m_k = \inf\{f(x) : x \in [x_{k-1}, x_k]\} \quad \text{for } 1 \le k \le 2^n,$$

and define a step function $s_n$ on $[a, b]$ as follows: 

$$s_n(x) = m_k \quad \text{if } x_{k-1} < x \le x_k, \qquad s_n(a) = m_1.$$

Then $s_n(x) \le f(x)$ for all $x$ in $[a, b]$. Also, $\{s_n\}$ is increasing because the inf of $f$ in a subinterval of $[x_{k-1}, x_k]$ cannot be less than that in $[x_{k-1}, x_k]$. 

Next, we prove that $s_n(x) \to f(x)$ at each interior point of continuity of $f$. Since the set of discontinuities of $f$ on $[a, b]$ has measure 0, this will show that $s_n \to f$ almost everywhere on $[a, b]$. If $f$ is continuous at $x$, then for every $\varepsilon > 0$ there is a $\delta$ (depending on $x$ and on $\varepsilon$) such that $f(x) - \varepsilon < f(y) < f(x) + \varepsilon$ whenever $x - \delta < y < x + \delta$. Let $m(\delta) = \inf\{f(y) : y \in (x - \delta, x + \delta)\}$. Then $f(x) - \varepsilon \le m(\delta)$, so $f(x) \le m(\delta) + \varepsilon$. Some partition $P_N$ has a subinterval $[x_{k-1}, x_k]$ containing $x$ and lying within the interval $(x - \delta, x + \delta)$. Therefore 

$$s_N(x) = m_k \le f(x) \le m(\delta) + \varepsilon \le m_k + \varepsilon = s_N(x) + \varepsilon.$$

But $s_n(x) \le f(x)$ for all $n$ and $s_N(x) \le s_n(x)$ for all $n \ge N$. Hence 

$$s_n(x) \le f(x) \le s_n(x) + \varepsilon \quad \text{if } n \ge N,$$

which shows that $s_n(x) \to f(x)$ as $n \to \infty$. 

The sequence of integrals $\{\int_a^b s_n\}$ converges because it is an increasing sequence, bounded above by $M(b - a)$, where $M = \sup\{f(x) : x \in [a, b]\}$. Moreover, 

$$\int_a^b s_n = \sum_{k=1}^{2^n} m_k (x_k - x_{k-1}) = L(P_n, f),$$

where $L(P_n, f)$ is a lower Riemann sum. Since the limit of an increasing sequence is equal to its supremum, the sequence $\{\int_a^b s_n\}$ converges to the Riemann integral of $f$ over $[a, b]$. (The Riemann integral $\int_a^b f(x)\,dx$ exists because of Lebesgue’s criterion, Theorem 7.48.) 
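
The construction in this proof can be followed numerically. The sketch below is an added illustration under stated assumptions: it takes the hypothetical example $f(x) = x^2$ on $[0, 1]$, and because that $f$ is increasing it uses $f(x_{k-1})$ as the infimum $m_k$ on each dyadic subinterval. The function name `lower_dyadic_sum` is an assumption, not notation from the text.

```python
from fractions import Fraction

def lower_dyadic_sum(f, a, b, n):
    """L(P_n, f) for an increasing f, with P_n splitting [a, b] into 2**n equal parts.

    For an increasing f the infimum m_k on [x_{k-1}, x_k] is f(x_{k-1}).
    """
    parts = 2 ** n
    width = (b - a) / parts
    return sum(f(a + (k - 1) * width) * width for k in range(1, parts + 1))

square = lambda x: x * x
sums = [lower_dyadic_sum(square, Fraction(0), Fraction(1), n) for n in range(11)]

# The integrals of the generating step functions increase with n ...
print(all(u < v for u, v in zip(sums, sums[1:])))
# ... and approach the Riemann integral 1/3 from below.
print(float(sums[-1]), float(Fraction(1, 3) - sums[-1]))
```

Bisection is what makes the sequence monotone: each subinterval of $P_{n+1}$ lies inside one of $P_n$, so the infimum can only go up.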


note. As already mentioned, there exist functions $f$ in $U(I)$ such that $-f \notin U(I)$. Therefore the class $U(I)$ is actually larger than the class of Riemann-integrable functions on $I$, since $-f \in R$ on $I$ if $f \in R$ on $I$. 


10.6 THE CLASS OF LEBESGUE-INTEGRABLE FUNCTIONS ON A 
GENERAL INTERVAL 


If u and v are upper functions, the difference u — v is not necessarily an upper 
function. We eliminate this undesirable property by enlarging the class of inte- 
grable functions. 


Definition 10.12. We denote by $L(I)$ the set of all functions $f$ of the form $f = u - v$, where $u \in U(I)$ and $v \in U(I)$. Each function $f$ in $L(I)$ is said to be Lebesgue-integrable on $I$, and its integral is defined by the equation 

$$\int_I f = \int_I u - \int_I v. \tag{7}$$

If $f \in L(I)$ it is possible to write $f$ as a difference of two upper functions $u - v$ in more than one way. The next theorem shows that the integral of $f$ is independent of the choice of $u$ and $v$. 


Theorem 10.13. Let $u$, $v$, $u_1$, and $v_1$ be functions in $U(I)$ such that $u - v = u_1 - v_1$. Then 

$$\int_I u - \int_I v = \int_I u_1 - \int_I v_1. \tag{8}$$

Proof. The functions $u + v_1$ and $u_1 + v$ are in $U(I)$ and $u + v_1 = u_1 + v$. Hence, by Theorem 10.6(a), we have $\int_I u + \int_I v_1 = \int_I u_1 + \int_I v$, which proves (8). 


note. If the interval $I$ has endpoints $a$ and $b$ in the extended real number system $\mathbf{R}^*$, where $a \le b$, we also write 

$$\int_a^b f \qquad \text{or} \qquad \int_a^b f(x)\,dx$$

for the Lebesgue integral $\int_I f$. We also define $\int_b^a f = -\int_a^b f$. 

If $[a, b]$ is a compact interval, every function which is Riemann-integrable on $[a, b]$ is in $U([a, b])$ and therefore also in $L([a, b])$. 


10.7 BASIC PROPERTIES OF THE LEBESGUE INTEGRAL 

Theorem 10.14. Assume $f \in L(I)$ and $g \in L(I)$. Then we have: 

a) $(af + bg) \in L(I)$ for every real $a$ and $b$, and 

$$\int_I (af + bg) = a \int_I f + b \int_I g.$$

b) $\int_I f \ge 0$ if $f(x) \ge 0$ a.e. on $I$. 

c) $\int_I f \ge \int_I g$ if $f(x) \ge g(x)$ a.e. on $I$. 

d) $\int_I f = \int_I g$ if $f(x) = g(x)$ a.e. on $I$. 

Proof. Part (a) follows easily from Theorem 10.6. To prove (b) we write $f = u - v$, where $u \in U(I)$ and $v \in U(I)$. Then $u(x) \ge v(x)$ almost everywhere on $I$ so, by Theorem 10.6(c), we have $\int_I u \ge \int_I v$ and hence 

$$\int_I f = \int_I u - \int_I v \ge 0.$$

Part (c) follows by applying (b) to $f - g$, and part (d) follows by applying (c) twice. 

Definition 10.15. If $f$ is a real-valued function, its positive part, denoted by $f^+$, and its negative part, denoted by $f^-$, are defined by the equations 

$$f^+ = \max(f, 0), \qquad f^- = \max(-f, 0).$$

Note that $f^+$ and $f^-$ are nonnegative functions and that 

$$f = f^+ - f^-, \qquad |f| = f^+ + f^-.$$

Examples are shown in Fig. 10.1. 

Figure 10.1 

Theorem 10.16. If $f$ and $g$ are in $L(I)$, then so are the functions $f^+$, $f^-$, $|f|$, $\max(f, g)$ and $\min(f, g)$. Moreover, we have 

$$\left| \int_I f \right| \le \int_I |f|. \tag{9}$$

Proof. Write $f = u - v$, where $u \in U(I)$ and $v \in U(I)$. Then 

$$f^+ = \max(u - v, 0) = \max(u, v) - v.$$

But $\max(u, v) \in U(I)$, by Theorem 10.9, and $v \in U(I)$, so $f^+ \in L(I)$. Since $f^- = f^+ - f$, we see that $f^- \in L(I)$. Finally, $|f| = f^+ + f^-$, so $|f| \in L(I)$. 
Since $-|f(x)| \le f(x) \le |f(x)|$ for all $x$ in $I$ we have 

$$-\int_I |f| \le \int_I f \le \int_I |f|,$$

which proves (9). To complete the proof we use the relations 

$$\max(f, g) = \tfrac{1}{2}(f + g + |f - g|), \qquad \min(f, g) = \tfrac{1}{2}(f + g - |f - g|).$$

The next theorem describes the behavior of a Lebesgue integral when the inter- 
val of integration is translated, expanded or contracted, or reflected through the 
origin. We use the following notation, where c denotes any real number: 

$$I + c = \{x + c : x \in I\}, \qquad cI = \{cx : x \in I\}.$$

Theorem 10.17. Assume $f \in L(I)$. Then we have: 

a) Invariance under translation. If $g(x) = f(x - c)$ for $x$ in $I + c$, then $g \in L(I + c)$, and 

$$\int_{I+c} g = \int_I f.$$

b) Behavior under expansion or contraction. If $g(x) = f(x/c)$ for $x$ in $cI$, where $c > 0$, then $g \in L(cI)$ and 

$$\int_{cI} g = c \int_I f.$$

c) Invariance under reflection. If $g(x) = f(-x)$ for $x$ in $-I$, then $g \in L(-I)$ and 

$$\int_{-I} g = \int_I f.$$

note. If $I$ has endpoints $a < b$, where $a$ and $b$ are in the extended real number system $\mathbf{R}^*$, the formula in (a) can also be written as follows: 

$$\int_{a+c}^{b+c} f(x - c)\,dx = \int_a^b f(x)\,dx.$$

Properties (b) and (c) can be combined into a single formula which includes both positive and negative values of $c$: 

$$\int_{ca}^{cb} f(x/c)\,dx = |c| \int_a^b f(x)\,dx \quad \text{if } c \ne 0.$$

Proof. In proving a theorem of this type, the procedure is always the same. First, we verify the theorem for step functions, then for upper functions, and finally for Lebesgue-integrable functions. At each step the argument is straightforward, so we omit the details. 
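
The first stage of that procedure, the verification for step functions, is easy to make concrete. The sketch below is an added illustration restricted to parts (a) and (b) with $c > 0$; the representation of a step function as a partition plus a list of values, and the helper names, are assumptions made for this example only.

```python
from fractions import Fraction

def step_integral(partition, values):
    """Sum of c_k * (x_k - x_{k-1}) over the subintervals, as in equation (1)."""
    return sum(v * (r - l) for v, l, r in zip(values, partition, partition[1:]))

def translate(partition, values, c):
    """Step data of g(x) = f(x - c) on I + c: shift every partition point by c."""
    return [x + c for x in partition], list(values)

def dilate(partition, values, c):
    """Step data of g(x) = f(x / c) on cI for c > 0: scale every partition point by c."""
    if c <= 0:
        raise ValueError("this sketch only covers c > 0")
    return [c * x for x in partition], list(values)

# Hypothetical step function: 3 on (0, 1/2) and 1 on (1/2, 2).
P = [Fraction(0), Fraction(1, 2), Fraction(2)]
V = [Fraction(3), Fraction(1)]
base = step_integral(P, V)

shifted = step_integral(*translate(P, V, Fraction(5)))
scaled = step_integral(*dilate(P, V, Fraction(4)))

print(base, shifted, scaled)
```

Translation leaves every subinterval length unchanged, while dilation by $c$ multiplies each length by $c$, which is exactly why the integral is unchanged in (a) and multiplied by $c$ in (b).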

Theorem 10.18. Let $I$ be an interval which is the union of two subintervals, say $I = I_1 \cup I_2$, where $I_1$ and $I_2$ have no interior points in common. 

a) If $f \in L(I)$, then $f \in L(I_1)$, $f \in L(I_2)$, and 

$$\int_I f = \int_{I_1} f + \int_{I_2} f.$$

b) Assume $f_1 \in L(I_1)$, $f_2 \in L(I_2)$, and let $f$ be defined on $I$ as follows: 

$$f(x) = \begin{cases} f_1(x) & \text{if } x \in I_1, \\ f_2(x) & \text{if } x \in I - I_1. \end{cases}$$

Then $f \in L(I)$ and $\int_I f = \int_{I_1} f_1 + \int_{I_2} f_2$. 

Proof. Write $f = u - v$ where $u \in U(I)$ and $v \in U(I)$. Then $u = u^+ - u^-$ and $v = v^+ - v^-$, so $f = u^+ + v^- - (u^- + v^+)$. Now apply Theorem 10.10 to each of the nonnegative functions $u^+ + v^-$ and $u^- + v^+$ to deduce part (a). The proof of part (b) is left to the reader. 

note. There is an extension of Theorem 10.18 for an interval which can be 
expressed as the union of a finite number of subintervals, no two of which have 
interior points in common. The reader can formulate this for himself. 

We conclude this section with two approximation properties that will be needed later. The first tells us that every Lebesgue-integrable function $f$ is equal to an upper function $u$ minus a nonnegative upper function $v$ with a small integral. The second tells us that $f$ is equal to a step function $s$ plus an integrable function $g$ with a small integral. More precisely, we have: 


Theorem 10.19. Assume $f \in L(I)$ and let $\varepsilon > 0$ be given. Then: 

a) There exist functions $u$ and $v$ in $U(I)$ such that $f = u - v$, where $v$ is nonnegative a.e. on $I$ and $\int_I v < \varepsilon$. 

b) There exists a step function $s$ and a function $g$ in $L(I)$ such that $f = s + g$, where $\int_I |g| < \varepsilon$. 

Proof. Since $f \in L(I)$, we can write $f = u_1 - v_1$ where $u_1$ and $v_1$ are in $U(I)$. Let $\{t_n\}$ be a sequence which generates $v_1$. Since $\int_I t_n \to \int_I v_1$, we can choose $N$ so that $0 \le \int_I (v_1 - t_N) < \varepsilon$. Now let $v = v_1 - t_N$ and $u = u_1 - t_N$. Then both $u$ and $v$ are in $U(I)$ and $u - v = u_1 - v_1 = f$. Also, $v$ is nonnegative a.e. on $I$ and $\int_I v < \varepsilon$. This proves (a). 

To prove (b) we use (a) to choose $u$ and $v$ in $U(I)$ so that $v \ge 0$ a.e. on $I$, 

$$f = u - v \qquad \text{and} \qquad 0 \le \int_I v < \frac{\varepsilon}{2}.$$

Now choose a step function $s$ such that $0 \le \int_I (u - s) < \varepsilon/2$. Then 

$$f = u - v = s + (u - s) - v = s + g,$$

where $g = (u - s) - v$. Hence $g \in L(I)$ and 

$$\int_I |g| \le \int_I |u - s| + \int_I |v| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$


10.8 LEBESGUE INTEGRATION AND SETS OF MEASURE ZERO 

The theorems in this section show that the behavior of a Lebesgue-integrable 
function on a set of measure zero does not affect its integral. 

Theorem 10.20. Let $f$ be defined on $I$. If $f = 0$ almost everywhere on $I$, then $f \in L(I)$ and $\int_I f = 0$. 

Proof. Let $s_n(x) = 0$ for all $x$ in $I$. Then $\{s_n\}$ is an increasing sequence of step functions which converges to 0 everywhere on $I$. Hence $\{s_n\}$ converges to $f$ almost everywhere on $I$. Since $\int_I s_n = 0$ the sequence $\{\int_I s_n\}$ converges. Therefore $f$ is an upper function, so $f \in L(I)$ and $\int_I f = \lim_{n \to \infty} \int_I s_n = 0$. 

Theorem 10.21. Let $f$ and $g$ be defined on $I$. If $f \in L(I)$ and if $f = g$ almost everywhere on $I$, then $g \in L(I)$ and $\int_I f = \int_I g$. 

Proof. Apply Theorem 10.20 to $f - g$. Then $f - g \in L(I)$ and $\int_I (f - g) = 0$. Hence $g = f - (f - g) \in L(I)$ and $\int_I g = \int_I f - \int_I (f - g) = \int_I f$. 

Example. Define $f$ on the interval $[0, 1]$ as follows: 

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational,} \\ 0 & \text{if } x \text{ is irrational.} \end{cases}$$

Then $f = 0$ almost everywhere on $[0, 1]$ so $f$ is Lebesgue-integrable on $[0, 1]$ and its Lebesgue integral is 0. As noted in Chapter 7, this function is not Riemann-integrable on $[0, 1]$. 

note. Theorem 10.21 suggests a definition of the integral for functions that are defined almost everywhere on $I$. If $g$ is such a function and if $g(x) = f(x)$ almost everywhere on $I$, where $f \in L(I)$, we say that $g \in L(I)$ and that 

$$\int_I g = \int_I f.$$
10.9 THE LEVI MONOTONE CONVERGENCE THEOREMS 


We turn next to convergence theorems concerning term-by-term integration of 
monotonic sequences of functions. We begin with three versions of a famous 
theorem of Beppo Levi. The first concerns sequences of step functions, the second 
sequences of upper functions, and the third sequences of Lebesgue-integrable 
functions. Although the theorems are stated for increasing sequences, there are 
corresponding results for decreasing sequences. 


Theorem 10.22 (Levi theorem for step functions). Let $\{s_n\}$ be a sequence of step functions such that 

a) $\{s_n\}$ increases on an interval $I$, and 

b) $\lim_{n \to \infty} \int_I s_n$ exists. 

Then $\{s_n\}$ converges almost everywhere on $I$ to a limit function $f$ in $U(I)$, and 

$$\int_I f = \lim_{n \to \infty} \int_I s_n.$$

Proof. We can assume, without loss of generality, that the step functions $s_n$ are nonnegative. (If not, consider instead the sequence $\{s_n - s_1\}$. If the theorem is true for $\{s_n - s_1\}$, then it is also true for $\{s_n\}$.) Let $D$ be the set of $x$ in $I$ for which $\{s_n(x)\}$ diverges, and let $\varepsilon > 0$ be given. We will prove that $D$ has measure 0 by showing that $D$ can be covered by a countable collection of intervals, the sum of whose lengths is $< \varepsilon$. 

Since the sequence $\{\int_I s_n\}$ converges it is bounded by some positive constant $M$. Let 

$$t_n(x) = \left[ \frac{\varepsilon}{2M}\, s_n(x) \right] \quad \text{if } x \in I,$$

where $[y]$ denotes the greatest integer $\le y$. Then $\{t_n\}$ is an increasing sequence of step functions and each function value $t_n(x)$ is a nonnegative integer. 

If $\{s_n(x)\}$ converges, then $\{s_n(x)\}$ is bounded so $\{t_n(x)\}$ is bounded and hence $t_{n+1}(x) = t_n(x)$ for all sufficiently large $n$, since each $t_n(x)$ is an integer. 

If $\{s_n(x)\}$ diverges, then $\{t_n(x)\}$ also diverges and $t_{n+1}(x) - t_n(x) \ge 1$ for infinitely many values of $n$. Let 


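$$D_n = \{x : x \in I \text{ and } t_{n+1}(x) - t_n(x) \ge 1\}.$$

Then $D_n$ is the union of a finite number of intervals, the sum of whose lengths we denote by $|D_n|$. Now 

$$D \subseteq \bigcup_{n=1}^{\infty} D_n,$$

so if we prove that $\sum_{n=1}^{\infty} |D_n| < \varepsilon$, this will show that $D$ has measure 0. 

To do this we integrate the nonnegative step function $t_{n+1} - t_n$ over $I$ and obtain the inequalities 

$$\int_I (t_{n+1} - t_n) \ge \int_{D_n} (t_{n+1} - t_n) \ge \int_{D_n} 1 = |D_n|.$$

Hence for every $m \ge 1$ we have 

$$\sum_{n=1}^{m} |D_n| \le \sum_{n=1}^{m} \int_I (t_{n+1} - t_n) = \int_I t_{m+1} - \int_I t_1 \le \int_I t_{m+1} \le \frac{\varepsilon}{2M} \int_I s_{m+1} \le \frac{\varepsilon}{2}.$$

Therefore $\sum_{n=1}^{\infty} |D_n| \le \varepsilon/2 < \varepsilon$, so $D$ has measure 0. 

This proves that $\{s_n\}$ converges almost everywhere on $I$. Let 

$$f(x) = \begin{cases} \lim_{n \to \infty} s_n(x) & \text{if } x \in I - D, \\ 0 & \text{if } x \in D. \end{cases}$$

Then $f$ is defined everywhere on $I$ and $s_n \to f$ almost everywhere on $I$. Therefore, $f \in U(I)$ and $\int_I f = \lim_{n \to \infty} \int_I s_n$. 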


Theorem 10.23 (Levi theorem for upper functions). Let $\{f_n\}$ be a sequence of upper functions such that 

a) $\{f_n\}$ increases almost everywhere on an interval $I$, 
and 

b) $\lim_{n \to \infty} \int_I f_n$ exists. 

Then $\{f_n\}$ converges almost everywhere on $I$ to a limit function $f$ in $U(I)$, and 

$$\int_I f = \lim_{n \to \infty} \int_I f_n.$$

Proof. For each $k$ there is an increasing sequence of step functions $\{s_{n,k}\}$ which generates $f_k$. Define a new step function $t_n$ on $I$ by the equation 

$$t_n(x) = \max\{s_{n,1}(x), s_{n,2}(x), \dots, s_{n,n}(x)\}.$$

Then $\{t_n\}$ is increasing on $I$ because 

$$t_{n+1}(x) = \max\{s_{n+1,1}(x), \dots, s_{n+1,n+1}(x)\} \ge \max\{s_{n,1}(x), \dots, s_{n,n+1}(x)\} \ge \max\{s_{n,1}(x), \dots, s_{n,n}(x)\} = t_n(x).$$

But $s_{n,k}(x) \le f_k(x)$ and $\{f_k\}$ increases almost everywhere on $I$, so we have 

$$t_n(x) \le \max\{f_1(x), \dots, f_n(x)\} = f_n(x) \tag{10}$$

almost everywhere on $I$. Therefore, by Theorem 10.6(c) we obtain 

$$\int_I t_n \le \int_I f_n. \tag{11}$$

But, by (b), $\{\int_I f_n\}$ is bounded above so the increasing sequence $\{\int_I t_n\}$ is also bounded above and hence converges. By the Levi theorem for step functions, $\{t_n\}$ converges almost everywhere on $I$ to a limit function $f$ in $U(I)$, and $\int_I f = \lim_{n \to \infty} \int_I t_n$. We prove next that $f_n \to f$ almost everywhere on $I$. 

The definition of $t_n(x)$ implies $s_{n,k}(x) \le t_n(x)$ for all $k \le n$ and all $x$ in $I$. Letting $n \to \infty$ we find 

$$f_k(x) \le f(x) \quad \text{almost everywhere on } I. \tag{12}$$

Therefore the increasing sequence $\{f_k(x)\}$ is bounded above by $f(x)$ almost everywhere on $I$, so it converges almost everywhere on $I$ to a limit function $g$ satisfying $g(x) \le f(x)$ almost everywhere on $I$. But (10) states that $t_n(x) \le f_n(x)$ almost everywhere on $I$ so, letting $n \to \infty$, we find $f(x) \le g(x)$ almost everywhere on $I$. In other words, we have 

$$\lim_{n \to \infty} f_n(x) = f(x) \quad \text{almost everywhere on } I.$$

Finally, we show that $\int_I f = \lim_{n \to \infty} \int_I f_n$. Letting $n \to \infty$ in (11) we obtain 

$$\int_I f \le \lim_{n \to \infty} \int_I f_n. \tag{13}$$

Now integrate (12), using Theorem 10.6(c) again, to get $\int_I f_k \le \int_I f$. Letting $k \to \infty$ we obtain $\lim_{k \to \infty} \int_I f_k \le \int_I f$ which, together with (13), completes the proof. 

note. The class $U(I)$ of upper functions was constructed from the class $S(I)$ of step functions by a certain process which we can call $P$. Beppo Levi’s theorem shows that when process $P$ is applied to $U(I)$ it again gives functions in $U(I)$. The next theorem shows that when $P$ is applied to $L(I)$ it again gives functions in $L(I)$. 

Theorem 10.24 (Levi theorem for sequences of Lebesgue-integrable functions). Let $\{f_n\}$ be a sequence of functions in $L(I)$ such that 

a) $\{f_n\}$ increases almost everywhere on $I$, 
and 

b) $\lim_{n \to \infty} \int_I f_n$ exists. 

Then $\{f_n\}$ converges almost everywhere on $I$ to a limit function $f$ in $L(I)$, and 

$$\int_I f = \lim_{n \to \infty} \int_I f_n.$$

We shall deduce this theorem from an equivalent result stated for series of functions. 


Theorem 10.25 (Levi theorem for series of Lebesgue-integrable functions). Let $\{g_n\}$ be a sequence of functions in $L(I)$ such that 

a) each $g_n$ is nonnegative almost everywhere on $I$, 
and 

b) the series $\sum_{n=1}^{\infty} \int_I g_n$ converges. 

Then the series $\sum_{n=1}^{\infty} g_n$ converges almost everywhere on $I$ to a sum function $g$ in $L(I)$, and we have 

$$\int_I g = \int_I \sum_{n=1}^{\infty} g_n = \sum_{n=1}^{\infty} \int_I g_n. \tag{14}$$


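Proof. Since $g_n \in L(I)$, Theorem 10.19 tells us that for every $\varepsilon > 0$ we can write 

$$g_n = u_n - v_n,$$

where $u_n \in U(I)$, $v_n \in U(I)$, $v_n \ge 0$ a.e. on $I$, and $\int_I v_n < \varepsilon$. Choose $u_n$ and $v_n$ corresponding to $\varepsilon = (\tfrac{1}{2})^n$. Then 

$$u_n = g_n + v_n, \qquad \text{where} \quad \int_I v_n < \left(\tfrac{1}{2}\right)^n.$$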


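The inequality on $\int_I v_n$ assures us that the series $\sum_{n=1}^{\infty} \int_I v_n$ converges. Now $u_n \ge 0$ almost everywhere on $I$, so the partial sums 

$$U_n(x) = \sum_{k=1}^{n} u_k(x)$$

form a sequence of upper functions $\{U_n\}$ which increases almost everywhere on $I$. Since 

$$\int_I U_n = \int_I \sum_{k=1}^{n} u_k = \sum_{k=1}^{n} \int_I u_k = \sum_{k=1}^{n} \int_I g_k + \sum_{k=1}^{n} \int_I v_k,$$

the sequence of integrals $\{\int_I U_n\}$ converges because both series $\sum_{k=1}^{\infty} \int_I g_k$ and $\sum_{k=1}^{\infty} \int_I v_k$ converge. Therefore, by the Levi theorem for upper functions, the sequence $\{U_n\}$ converges almost everywhere on $I$ to a limit function $U$ in $U(I)$, and $\int_I U = \lim_{n \to \infty} \int_I U_n$. But 

$$\int_I U_n = \sum_{k=1}^{n} \int_I u_k,$$

so 

$$\int_I U = \sum_{k=1}^{\infty} \int_I u_k.$$

Similarly, the sequence of partial sums $\{V_n\}$ given by 

$$V_n(x) = \sum_{k=1}^{n} v_k(x)$$

converges almost everywhere on $I$ to a limit function $V$ in $U(I)$ and 

$$\int_I V = \sum_{k=1}^{\infty} \int_I v_k.$$

Therefore $U - V \in L(I)$ and the sequence $\{\sum_{k=1}^{n} g_k\} = \{U_n - V_n\}$ converges almost everywhere on $I$ to $U - V$. Let $g = U - V$. Then $g \in L(I)$ and 

$$\int_I g = \int_I U - \int_I V = \sum_{k=1}^{\infty} \int_I (u_k - v_k) = \sum_{k=1}^{\infty} \int_I g_k.$$

This completes the proof of Theorem 10.25. 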

Proof of Theorem 10.24. Assume $\{f_n\}$ satisfies the hypotheses of Theorem 10.24. Let $g_1 = f_1$ and let $g_n = f_n - f_{n-1}$ for $n \ge 2$, so that 

$$f_n = \sum_{k=1}^{n} g_k.$$

Applying Theorem 10.25 to $\{g_n\}$, we find that $\sum_{n=1}^{\infty} g_n$ converges almost everywhere on $I$ to a sum function $g$ in $L(I)$, and Equation (14) holds. Therefore $f_n \to g$ almost everywhere on $I$ and $\int_I g = \lim_{n \to \infty} \int_I f_n$. 

In the following version of the Levi theorem for series, the terms of the series 
are not assumed to be nonnegative. 

Theorem 10.26. Let $\{g_n\}$ be a sequence of functions in $L(I)$ such that the series 

$$\sum_{n=1}^{\infty} \int_I |g_n|$$

is convergent. Then the series $\sum_{n=1}^{\infty} g_n$ converges almost everywhere on $I$ to a sum function $g$ in $L(I)$ and we have 

$$\int_I \sum_{n=1}^{\infty} g_n = \sum_{n=1}^{\infty} \int_I g_n.$$

Proof. Write $g_n = g_n^+ - g_n^-$ and apply Theorem 10.25 to the sequences $\{g_n^+\}$ and $\{g_n^-\}$ separately. 

The following examples illustrate the use of the Levi theorem for sequences. 

Example 1. Let $f(x) = x^s$ for $x > 0$, $f(0) = 0$. Prove that the Lebesgue integral $\int_0^1 f(x)\,dx$ exists and has the value $1/(s + 1)$ if $s > -1$. 

Solution. If $s \ge 0$, then $f$ is bounded and Riemann-integrable on $[0, 1]$ and its Riemann integral is equal to $1/(s + 1)$. 

If $s < 0$, then $f$ is not bounded and hence not Riemann-integrable on $[0, 1]$. Define a sequence of functions $\{f_n\}$ as follows: 

$$f_n(x) = \begin{cases} x^s & \text{if } x \ge 1/n, \\ 0 & \text{if } 0 \le x < 1/n. \end{cases}$$

Then $\{f_n\}$ is increasing and $f_n \to f$ everywhere on $[0, 1]$. Each $f_n$ is Riemann-integrable and hence Lebesgue-integrable on $[0, 1]$ and 

$$\int_0^1 f_n(x)\,dx = \int_{1/n}^{1} x^s\,dx = \frac{1}{s + 1}\left(1 - \frac{1}{n^{s+1}}\right).$$

If $s + 1 > 0$, the sequence $\{\int_0^1 f_n\}$ converges to $1/(s + 1)$. Therefore, the Levi theorem for sequences shows that $\int_0^1 f$ exists and equals $1/(s + 1)$. 

Example 2. The same type of argument shows that the Lebesgue integral $\int_0^1 e^{-x} x^{y-1}\,dx$ exists for every real $y > 0$. This integral will be used later in discussing the Gamma function. 
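
The behavior of the truncated integrals in Example 1 can be tabulated directly from the closed form $(1 - n^{-(s+1)})/(s + 1)$. The sketch below is an added illustration that assumes the sample exponent $s = -1/2$, for which the limit should be 2; the function name is hypothetical.

```python
def truncated_integral(s, n):
    """Integral of f_n over [0, 1] for f(x) = x**s, s != -1:
    (1 - n**-(s + 1)) / (s + 1)."""
    return (1.0 - n ** (-(s + 1.0))) / (s + 1.0)

s = -0.5
ns = (1, 10, 100, 10_000, 1_000_000)
values = [truncated_integral(s, n) for n in ns]

for n, v in zip(ns, values):
    print(n, v)

# The integrals increase with n and stay below the limit 1 / (s + 1).
print(all(u < v for u, v in zip(values, values[1:])), 1.0 / (s + 1.0))
```

The increasing, bounded sequence of integrals is exactly the hypothesis that the Levi theorem for sequences requires.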


10.10 THE LEBESGUE DOMINATED CONVERGENCE THEOREM 

Levi’s theorems have many important consequences. The first is Lebesgue’s 
dominated convergence theorem, the cornerstone of Lebesgue’s theory of inte- 
gration. 

Theorem 10.27 (Lebesgue dominated convergence theorem). Let $\{f_n\}$ be a sequence of Lebesgue-integrable functions on an interval $I$. Assume that 

a) $\{f_n\}$ converges almost everywhere on $I$ to a limit function $f$, 
and 

b) there is a nonnegative function $g$ in $L(I)$ such that, for all $n \ge 1$, 

$$|f_n(x)| \le g(x) \quad \text{a.e. on } I.$$

Then the limit function $f \in L(I)$, the sequence $\{\int_I f_n\}$ converges and 

$$\int_I f = \lim_{n \to \infty} \int_I f_n. \tag{15}$$

note. Property (b) is described by saying that the sequence $\{f_n\}$ is dominated by $g$ almost everywhere on $I$. 
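
A standard example, not taken from this text, shows what can go wrong when hypothesis (b) fails. Assume $f_n = n$ on $(0, 1/n)$ and $f_n = 0$ elsewhere on $[0, 1]$. Then $f_n(x) \to 0$ for every $x$, yet $\int f_n = 1$ for all $n$, so (15) does not hold. The sketch below also computes a lower bound for the integral of $\sup_n f_n$ by integrating the step function equal to $n$ on $(1/(n+1), 1/n)$; the bound grows like the harmonic series, which suggests that no integrable dominating function exists. The function names are assumptions for illustration.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n on (0, 1/n) and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of f_n: height n times length 1/n."""
    return n * Fraction(1, n)

def envelope_lower_bound(m):
    """Integral of the step function equal to n on (1/(n+1), 1/n), n = 1..m.

    On that interval f_n takes the value n, so sup_n f_n is at least n there.
    """
    return sum(n * (Fraction(1, n) - Fraction(1, n + 1)) for n in range(1, m + 1))

x = Fraction(1, 10)
print([f(n, x) for n in (5, 9, 10, 11, 1000)])   # eventually 0 at a fixed x
print([integral_f(n) for n in (1, 10, 1000)])    # always 1
print([float(envelope_lower_bound(m)) for m in (10, 100, 1000)])
```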

Proof. The idea of the proof is to obtain upper and lower bounds of the form 

$$g_n(x) \le f_n(x) \le G_n(x) \tag{16}$$

where $\{g_n\}$ increases and $\{G_n\}$ decreases almost everywhere on $I$ to the limit function $f$. Then we use the Levi theorem to show that $f \in L(I)$ and that $\int_I f = \lim_{n \to \infty} \int_I g_n = \lim_{n \to \infty} \int_I G_n$, from which we obtain (15). 

To construct $\{g_n\}$ and $\{G_n\}$, we make repeated use of the Levi theorem for sequences in $L(I)$. First we define a sequence $\{G_{n,1}\}$ as follows: 

$$G_{n,1}(x) = \max\{f_1(x), f_2(x), \dots, f_n(x)\}.$$

Each function $G_{n,1} \in L(I)$, by Theorem 10.16, and the sequence $\{G_{n,1}\}$ is increasing on $I$. Since $|G_{n,1}(x)| \le g(x)$ almost everywhere on $I$, we have 

$$\left| \int_I G_{n,1} \right| \le \int_I |G_{n,1}| \le \int_I g. \tag{17}$$

Therefore the increasing sequence of numbers $\{\int_I G_{n,1}\}$ is bounded above by $\int_I g$, so $\lim_{n \to \infty} \int_I G_{n,1}$ exists. By the Levi theorem, the sequence $\{G_{n,1}\}$ converges almost everywhere on $I$ to a function $G_1$ in $L(I)$, and 

$$\int_I G_1 = \lim_{n \to \infty} \int_I G_{n,1} \le \int_I g.$$

Because of (17) we also have the inequality $-\int_I g \le \int_I G_1$. Note that if $x$ is a point in $I$ for which $G_{n,1}(x) \to G_1(x)$, then we also have 

$$G_1(x) = \sup\{f_1(x), f_2(x), \dots\}.$$

In the same way, for each fixed $r \ge 1$ we let

\[ G_{n,r}(x) = \max\{f_r(x), f_{r+1}(x), \ldots, f_n(x)\} \]

for $n \ge r$. Then the sequence $\{G_{n,r}\}$ increases and converges almost everywhere on $I$ to a limit function $G_r$ in $L(I)$ with

\[ -\int_I g \le \int_I G_r \le \int_I g. \]

Also, at those points for which $G_{n,r}(x) \to G_r(x)$ we have

\[ G_r(x) = \sup\{f_r(x), f_{r+1}(x), \ldots\}, \]

so

\[ f_r(x) \le G_r(x) \quad \text{a.e. on } I. \]


Now we examine properties of the sequence $\{G_n(x)\}$. Since $A \subseteq B$ implies $\sup A \le \sup B$, the sequence $\{G_r(x)\}$ decreases almost everywhere and hence converges almost everywhere on $I$. We show next that $G_n(x) \to f(x)$ whenever

\[ \lim_{n\to\infty} f_n(x) = f(x). \tag{18} \]

If (18) holds, then for every $\varepsilon > 0$ there is an integer $N$ such that

\[ f(x) - \varepsilon < f_n(x) < f(x) + \varepsilon \quad \text{for all } n \ge N. \]

Hence, if $m \ge N$ we have

\[ f(x) - \varepsilon \le \sup\{f_m(x), f_{m+1}(x), \ldots\} \le f(x) + \varepsilon. \]

In other words,

\[ m \ge N \quad \text{implies} \quad f(x) - \varepsilon \le G_m(x) \le f(x) + \varepsilon, \]

and this implies that

\[ \lim_{m\to\infty} G_m(x) = f(x) \quad \text{almost everywhere on } I. \tag{19} \]

On the other hand, the decreasing sequence of numbers $\{\int_I G_n\}$ is bounded below by $-\int_I g$, so it converges. By (19) and the Levi theorem, we see that $f \in L(I)$ and

\[ \lim_{n\to\infty} \int_I G_n = \int_I f. \]

By applying the same type of argument to the sequence

\[ g_{n,r}(x) = \min\{f_r(x), f_{r+1}(x), \ldots, f_n(x)\}, \]

for $n \ge r$, we find that $\{g_{n,r}\}$ decreases and converges almost everywhere to a limit function $g_r$ in $L(I)$, where

\[ g_r(x) = \inf\{f_r(x), f_{r+1}(x), \ldots\} \quad \text{a.e. on } I. \]

Also, almost everywhere on $I$ we have $g_r(x) \le f_r(x)$, $\{g_r\}$ increases, $\lim_{n\to\infty} g_n(x) = f(x)$, and

\[ \lim_{n\to\infty} \int_I g_n = \int_I f. \]

Since (16) holds almost everywhere on $I$ we have $\int_I g_n \le \int_I f_n \le \int_I G_n$. Letting $n \to \infty$ we find that $\{\int_I f_n\}$ converges and that

\[ \lim_{n\to\infty} \int_I f_n = \int_I f. \]
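
The role of the dominating function can be seen in a small numerical experiment. The sketch below is an illustration added under stated assumptions, not part of the original argument: the two families `power(n)` (the functions $x^n$ on $[0,1]$, dominated by the constant 1) and `spike(n)` (height $n$ on $(0, 1/n)$, zero elsewhere) are hypothetical examples chosen for this purpose. Both families converge to 0 almost everywhere on $[0,1]$, but only the dominated family has integrals tending to 0; each spike has integral 1, and the spikes admit no integrable dominating function.

```python
def midpoint_integral(f, a: float, b: float, m: int) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b], m subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))


def power(n: int):
    """x -> x**n on [0, 1]; dominated by g(x) = 1, and tends to 0 for 0 <= x < 1."""
    return lambda x: x ** n


def spike(n: int):
    """Height n on (0, 1/n), zero elsewhere; tends to 0 pointwise, integral stays 1."""
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0


if __name__ == "__main__":
    for n in (1, 10, 100):
        dominated = midpoint_integral(power(n), 0.0, 1.0, 100_000)
        undominated = midpoint_integral(spike(n), 0.0, 1.0, 100_000)
        print(n, dominated, undominated)
```

The first column of integrals decreases like $1/(n+1)$, in agreement with (15); the second stays at 1 although the pointwise limit is 0, which shows that hypothesis (b) cannot simply be dropped.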



10.11 APPLICATIONS OF LEBESGUE’S DOMINATED CONVERGENCE 
THEOREM 

The first application concerns term-by-term integration of series and is a companion 
result to Levi’s theorem on series. 

Theorem 10.28. Let $\{g_n\}$ be a sequence of functions in $L(I)$ such that:

a) each $g_n$ is nonnegative almost everywhere on $I$,
and

b) the series $\sum_{n=1}^{\infty} g_n$ converges almost everywhere on $I$ to a function $g$ which is bounded above by a function in $L(I)$.

Then $g \in L(I)$, the series $\sum_{n=1}^{\infty} \int_I g_n$ converges, and we have

\[ \int_I \sum_{n=1}^{\infty} g_n = \sum_{n=1}^{\infty} \int_I g_n. \]

Proof. Let

\[ f_n(x) = \sum_{k=1}^{n} g_k(x) \quad \text{if } x \in I. \]

Then $f_n \to g$ almost everywhere on $I$, and $\{f_n\}$ is dominated almost everywhere on $I$ by the function in $L(I)$ which bounds $g$ from above. Therefore, by the Lebesgue dominated convergence theorem, $g \in L(I)$, the sequence $\{\int_I f_n\}$ converges, and $\int_I g = \lim_{n\to\infty} \int_I f_n$. This proves the theorem.

The next application, sometimes called the Lebesgue bounded convergence theorem, refers to a bounded interval.

Theorem 10.29. Let $I$ be a bounded interval. Assume $\{f_n\}$ is a sequence of functions in $L(I)$ which is boundedly convergent almost everywhere on $I$. That is, assume there is a limit function $f$ and a positive constant $M$ such that

\[ \lim_{n\to\infty} f_n(x) = f(x) \quad \text{and} \quad |f_n(x)| \le M, \quad \text{almost everywhere on } I. \]

Then $f \in L(I)$ and $\lim_{n\to\infty} \int_I f_n = \int_I f$.

Proof. Apply Theorem 10.27 with $g(x) = M$ for all $x$ in $I$. Then $g \in L(I)$, since $I$ is a bounded interval.

note. A special case of Theorem 10.29 is Arzelà's theorem stated earlier (Theorem 9.12). If $\{f_n\}$ is a boundedly convergent sequence of Riemann-integrable functions on a compact interval $[a, b]$, then each $f_n \in L([a, b])$, the limit function $f \in L([a, b])$, and we have

\[ \lim_{n\to\infty} \int_a^b f_n = \int_a^b f. \]

If the limit function $f$ is Riemann-integrable (as assumed in Arzelà's theorem), then the Lebesgue integral $\int_a^b f$ is the same as the Riemann integral $\int_a^b f(x)\,dx$.

The next theorem is often used to show that functions are Lebesgue-integrable. 

Theorem 10.30. Let $\{f_n\}$ be a sequence of functions in $L(I)$ which converges almost everywhere on $I$ to a limit function $f$. Assume that there is a nonnegative function $g$ in $L(I)$ such that

\[ |f(x)| \le g(x) \quad \text{a.e. on } I. \]

Then $f \in L(I)$.

Proof. Define a new sequence of functions $\{g_n\}$ on $I$ as follows:

\[ g_n = \max\{\min(f_n, g), -g\}. \]

Geometrically, the function $g_n$ is obtained from $f_n$ by cutting off the graph of $f_n$ from above by $g$ and from below by $-g$, as shown by the example in Fig. 10.2. Then $|g_n(x)| \le g(x)$ almost everywhere on $I$, and it is easy to verify that $g_n \to f$ almost everywhere on $I$. Therefore, by the Lebesgue dominated convergence theorem, $f \in L(I)$.


10.12 LEBESGUE INTEGRALS ON UNBOUNDED INTERVALS AS LIMITS 
OF INTEGRALS ON BOUNDED INTERVALS 

Theorem 10.31. Let $f$ be defined on the half-infinite interval $I = [a, +\infty)$. Assume that $f$ is Lebesgue-integrable on the compact interval $[a, b]$ for each $b \ge a$, and that there is a positive constant $M$ such that

\[ \int_a^b |f| \le M \quad \text{for all } b \ge a. \tag{20} \]

Then $f \in L(I)$, the limit $\lim_{b\to+\infty} \int_a^b f$ exists, and

\[ \int_a^{+\infty} f = \lim_{b\to+\infty} \int_a^b f. \tag{21} \]


Proof. Let $\{b_n\}$ be any increasing sequence of real numbers with $b_n \ge a$ such that $\lim_{n\to\infty} b_n = +\infty$. Define a sequence $\{f_n\}$ on $I$ as follows:

\[ f_n(x) = \begin{cases} f(x) & \text{if } a \le x \le b_n, \\ 0 & \text{otherwise.} \end{cases} \]

Each $f_n \in L(I)$ (by Theorem 10.18) and $f_n \to f$ on $I$. Hence, $|f_n| \to |f|$ on $I$. But $|f_n|$ is increasing and, by (20), the sequence $\{\int_I |f_n|\}$ is bounded above by $M$. Therefore $\lim_{n\to\infty} \int_I |f_n|$ exists. By the Levi theorem, the limit function $|f| \in L(I)$. Now each $|f_n| \le |f|$ and $f_n \to f$ on $I$, so by the Lebesgue dominated convergence theorem, $f \in L(I)$ and $\lim_{n\to\infty} \int_I f_n = \int_I f$. Therefore

\[ \lim_{n\to\infty} \int_a^{b_n} f = \int_a^{+\infty} f \]

for all sequences $\{b_n\}$ which increase to $+\infty$. This completes the proof.

There is, of course, a corresponding theorem for the interval $(-\infty, a]$ which concludes that

\[ \int_{-\infty}^{a} f = \lim_{c\to-\infty} \int_c^a f, \]

provided that $\int_c^a |f| \le M$ for all $c \le a$. If $\int_c^b |f| \le M$ for all real $c$ and $b$ with $c \le b$, the two theorems together show that $f \in L(\mathbf{R})$ and that

\[ \int_{-\infty}^{+\infty} f = \lim_{c\to-\infty} \int_c^a f + \lim_{b\to+\infty} \int_a^b f. \]

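Example 1. Let $f(x) = 1/(1 + x^2)$ for all $x$ in $\mathbf{R}$. We shall prove that $f \in L(\mathbf{R})$ and that $\int_{\mathbf{R}} f = \pi$. Now $f$ is nonnegative, and if $c \le b$ we have

\[ \int_c^b f = \int_c^b \frac{dx}{1 + x^2} = \arctan b - \arctan c \le \pi. \]

Therefore, $f \in L(\mathbf{R})$ and

\[ \int_{-\infty}^{+\infty} f = \lim_{c\to-\infty} \int_c^0 \frac{dx}{1 + x^2} + \lim_{b\to+\infty} \int_0^b \frac{dx}{1 + x^2} = \frac{\pi}{2} + \frac{\pi}{2} = \pi. \]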


Example 2. In this example the limit on the right of (21) exists but $f \notin L(I)$. Let $I = [0, +\infty)$ and define $f$ on $I$ as follows:

\[ f(x) = \frac{(-1)^n}{n} \quad \text{if } n - 1 \le x < n, \text{ for } n = 1, 2, \ldots \]

If $b > 0$, let $m = [b]$, the greatest integer $\le b$. Then

\[ \int_0^b f = \int_0^m f + \int_m^b f = \sum_{n=1}^{m} \frac{(-1)^n}{n} + \frac{(b - m)(-1)^{m+1}}{m + 1}. \]

As $b \to +\infty$ the last term $\to 0$, and we find

\[ \lim_{b\to+\infty} \int_0^b f = \sum_{n=1}^{\infty} \frac{(-1)^n}{n} = -\log 2. \]

Now we assume $f \in L(I)$ and obtain a contradiction. Let $f_n$ be defined by

\[ f_n(x) = \begin{cases} |f(x)| & \text{for } 0 \le x \le n, \\ 0 & \text{for } x > n. \end{cases} \]

Then $\{f_n\}$ increases and $f_n(x) \to |f(x)|$ everywhere on $I$. Since $f \in L(I)$ we also have $|f| \in L(I)$. But $|f_n(x)| \le |f(x)|$ everywhere on $I$ so by the Lebesgue dominated convergence theorem the sequence $\{\int_I f_n\}$ converges. But this is a contradiction since

\[ \int_I f_n = \int_0^n |f| = \sum_{k=1}^{n} \frac{1}{k} \to +\infty \]

as $n \to \infty$.
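
The contrast in Example 2 between the convergent limit of $\int_0^b f$ and the divergent integrals of $|f|$ can be tabulated directly from the two sums. This is a minimal sketch; the function names `integral_of_f` and `integral_of_abs_f` are hypothetical labels introduced only for this illustration.

```python
import math


def integral_of_f(b: float) -> float:
    """Integral of f over [0, b], where f = (-1)**n / n on [n - 1, n)."""
    m = math.floor(b)
    whole = sum((-1) ** n / n for n in range(1, m + 1))
    # Remaining piece on [m, b], where f has the constant value (-1)**(m+1)/(m+1).
    return whole + (b - m) * (-1) ** (m + 1) / (m + 1)


def integral_of_abs_f(n: int) -> float:
    """Integral of |f| over [0, n]: the n-th partial sum of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    for b in (10.5, 1_000.5, 100_000.5):
        print(b, integral_of_f(b), integral_of_abs_f(math.floor(b)))
    print("-log 2 =", -math.log(2.0))
```

The second column settles near $-\log 2$ while the third grows without bound, roughly like $\log n$.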
10.13 IMPROPER RIEMANN INTEGRALS

Definition 10.32. If $f$ is Riemann-integrable on $[a, b]$ for every $b \ge a$, and if the limit

\[ \lim_{b\to+\infty} \int_a^b f(x)\,dx \quad \text{exists,} \]

then $f$ is said to be improper Riemann-integrable on $[a, +\infty)$ and the improper Riemann integral of $f$, denoted by $\int_a^{+\infty} f(x)\,dx$ or $\int_a^{\infty} f(x)\,dx$, is defined by the equation

\[ \int_a^{+\infty} f(x)\,dx = \lim_{b\to+\infty} \int_a^b f(x)\,dx. \]


In Example 2 of the foregoing section the improper Riemann integral $\int_0^{+\infty} f(x)\,dx$ exists but $f$ is not Lebesgue-integrable on $[0, +\infty)$. That example should be contrasted with the following theorem.

Theorem 10.33. Assume $f$ is Riemann-integrable on $[a, b]$ for every $b \ge a$, and assume there is a positive constant $M$ such that

\[ \int_a^b |f(x)|\,dx \le M \quad \text{for every } b \ge a. \tag{22} \]

Then both $f$ and $|f|$ are improper Riemann-integrable on $[a, +\infty)$. Also, $f$ is Lebesgue-integrable on $[a, +\infty)$ and the Lebesgue integral of $f$ is equal to the improper Riemann integral of $f$.


Proof. Let $F(b) = \int_a^b |f(x)|\,dx$. Then $F$ is an increasing function which is bounded above by $M$, so $\lim_{b\to+\infty} F(b)$ exists. Therefore $|f|$ is improper Riemann-integrable on $[a, +\infty)$. Since

\[ 0 \le |f(x)| - f(x) \le 2|f(x)|, \]

the limit

\[ \lim_{b\to+\infty} \int_a^b \{|f(x)| - f(x)\}\,dx \]

also exists; hence the limit $\lim_{b\to+\infty} \int_a^b f(x)\,dx$ exists. This proves that $f$ is improper Riemann-integrable on $[a, +\infty)$. Now we use inequality (22), along with Theorem 10.31, to deduce that $f$ is Lebesgue-integrable on $[a, +\infty)$ and that the Lebesgue integral of $f$ is equal to the improper Riemann integral of $f$.


NOTE. There are corresponding results for improper Riemann integrals of the form

\[ \int_{-\infty}^{b} f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx, \]

\[ \int_a^c f(x)\,dx = \lim_{b\to c-} \int_a^b f(x)\,dx, \]

and

\[ \int_c^b f(x)\,dx = \lim_{a\to c+} \int_a^b f(x)\,dx, \]

which the reader can formulate for himself.

If both integrals $\int_{-\infty}^{a} f(x)\,dx$ and $\int_a^{+\infty} f(x)\,dx$ exist, we say that the integral $\int_{-\infty}^{+\infty} f(x)\,dx$ exists, and its value is defined to be their sum,

\[ \int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{a} f(x)\,dx + \int_a^{+\infty} f(x)\,dx. \]

If the integral $\int_{-\infty}^{+\infty} f(x)\,dx$ exists, its value is also equal to the symmetric limit

\[ \lim_{b\to+\infty} \int_{-b}^{b} f(x)\,dx. \]

However, it is important to realize that the symmetric limit might exist even when $\int_{-\infty}^{+\infty} f(x)\,dx$ does not exist (for example, take $f(x) = x$ for all $x$). In this case the symmetric limit is called the Cauchy principal value of $\int_{-\infty}^{+\infty} f(x)\,dx$. Thus $\int_{-\infty}^{+\infty} x\,dx$ has Cauchy principal value 0, but the integral does not exist.

Example 1. Let $f(x) = e^{-x} x^{y-1}$, where $y$ is a fixed real number. Since $e^{-x/2} x^{y-1} \to 0$ as $x \to +\infty$, there is a constant $M$ such that $e^{-x/2} x^{y-1} \le M$ for all $x \ge 1$. Then $e^{-x} x^{y-1} \le M e^{-x/2}$, so

\[ \int_1^b |f(x)|\,dx \le M \int_0^b e^{-x/2}\,dx = 2M(1 - e^{-b/2}) < 2M. \]

Hence the integral $\int_1^{+\infty} e^{-x} x^{y-1}\,dx$ exists for every real $y$, both as an improper Riemann integral and as a Lebesgue integral.

Example 2. The Gamma function integral. Adding the integral of Example 1 to the integral $\int_0^1 e^{-x} x^{y-1}\,dx$ of Example 2 of Section 10.9, we find that the Lebesgue integral

\[ \Gamma(y) = \int_0^{+\infty} e^{-x} x^{y-1}\,dx \]

exists for each real $y > 0$. The function $\Gamma$ so defined is called the Gamma function. Example 4 below shows its relation to the Riemann zeta function.

note. Many of the theorems in Chapter 7 concerning Riemann integrals can be 
converted into theorems on improper Riemann integrals. To illustrate the straight- 
forward manner in which some of these extensions can be made, consider the 
formula for integration by parts : 

\[ \int_a^b f(x) g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b g(x) f'(x)\,dx. \]

Since $b$ appears in three terms of this equation, there are three limits to consider as $b \to +\infty$. If two of these limits exist, the third also exists and we get the formula

\[ \int_a^{\infty} f(x) g'(x)\,dx = \lim_{b\to+\infty} f(b)g(b) - f(a)g(a) - \int_a^{\infty} g(x) f'(x)\,dx. \]

Other theorems on Riemann integrals can be extended in much the same way to improper Riemann integrals. However, it is not necessary to develop the details of these extensions any further, since in any particular example, it suffices to apply the required theorem to a compact interval $[a, b]$ and then let $b \to +\infty$.


Example 3. The functional equation $\Gamma(y + 1) = y\Gamma(y)$. If $0 < a < b$, integration by parts gives

\[ \int_a^b e^{-x} x^{y}\,dx = a^y e^{-a} - b^y e^{-b} + y \int_a^b e^{-x} x^{y-1}\,dx. \]

Letting $a \to 0+$ and $b \to +\infty$, we find $\Gamma(y + 1) = y\Gamma(y)$.
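
As a sanity check on Example 3, the integration-by-parts identity on a compact interval can be verified numerically, and the functional equation can be compared against the standard-library function `math.gamma`. The sketch below is illustrative only: the helper names and the sample values $y = 2.5$, $a = 0.5$, $b = 10$ are assumptions made for this check.

```python
import math


def midpoint_integral(f, a: float, b: float, m: int) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b], m subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))


def parts_identity_gap(y: float, a: float, b: float, m: int = 200_000) -> float:
    """Left side minus right side of the integration-by-parts identity on [a, b]."""
    lhs = midpoint_integral(lambda x: math.exp(-x) * x ** y, a, b, m)
    rhs = (a ** y * math.exp(-a) - b ** y * math.exp(-b)
           + y * midpoint_integral(lambda x: math.exp(-x) * x ** (y - 1.0), a, b, m))
    return lhs - rhs


if __name__ == "__main__":
    y = 2.5  # hypothetical sample value with y > 0
    print("gap on [0.5, 10]:", parts_identity_gap(y, 0.5, 10.0))
    print("Gamma(y + 1)    :", math.gamma(y + 1.0))
    print("y * Gamma(y)    :", y * math.gamma(y))
```

The gap is of the order of the quadrature error, and the last two printed values agree to rounding error.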


Example 4. Integral representation for the Riemann zeta function. The Riemann zeta function $\zeta$ is defined for $s > 1$ by the equation

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

This example shows how the Levi convergence theorem for series can be used to derive an integral representation,

\[ \zeta(s)\Gamma(s) = \int_0^{\infty} \frac{x^{s-1}}{e^x - 1}\,dx. \]

The integral exists as a Lebesgue integral.

In the integral for $\Gamma(s)$ we make the change of variable $t = nx$, $n > 0$, to obtain

\[ \Gamma(s) = \int_0^{\infty} e^{-t} t^{s-1}\,dt = n^s \int_0^{\infty} e^{-nx} x^{s-1}\,dx. \]

Hence, if $s > 0$, we have

\[ n^{-s}\Gamma(s) = \int_0^{\infty} e^{-nx} x^{s-1}\,dx. \]

If $s > 1$, the series $\sum_{n=1}^{\infty} n^{-s}$ converges, so we have

\[ \zeta(s)\Gamma(s) = \sum_{n=1}^{\infty} \int_0^{\infty} e^{-nx} x^{s-1}\,dx, \]

the series on the right being convergent. Since the integrand is nonnegative, Levi's convergence theorem (Theorem 10.25) tells us that the series $\sum_{n=1}^{\infty} e^{-nx} x^{s-1}$ converges almost everywhere to a sum function which is Lebesgue-integrable on $[0, +\infty)$ and that

\[ \zeta(s)\Gamma(s) = \sum_{n=1}^{\infty} \int_0^{\infty} e^{-nx} x^{s-1}\,dx = \int_0^{\infty} \sum_{n=1}^{\infty} e^{-nx} x^{s-1}\,dx. \]

But if $x > 0$, we have $0 < e^{-x} < 1$ and hence,

\[ \sum_{n=1}^{\infty} e^{-nx} = \frac{e^{-x}}{1 - e^{-x}} = \frac{1}{e^x - 1}, \]

the series being a geometric series. Therefore we have

\[ \sum_{n=1}^{\infty} e^{-nx} x^{s-1} = \frac{x^{s-1}}{e^x - 1} \]

almost everywhere on $[0, +\infty)$, in fact everywhere except at 0, so

\[ \zeta(s)\Gamma(s) = \int_0^{\infty} \sum_{n=1}^{\infty} e^{-nx} x^{s-1}\,dx = \int_0^{\infty} \frac{x^{s-1}}{e^x - 1}\,dx. \]
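
The integral representation of Example 4 can be tested numerically for $s = 2$, where $\zeta(2)\Gamma(2) = \pi^2/6$. The following sketch is an illustration under stated assumptions: the integral is truncated at a hypothetical upper limit of 60 (the neglected tail is far below the quadrature error), and the helper names are introduced only for this check.

```python
import math


def midpoint_integral(f, a: float, b: float, m: int) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b], m subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))


def zeta_gamma_integral(s: float, upper: float = 60.0, m: int = 200_000) -> float:
    """Approximate the integral of x**(s-1) / (e**x - 1) over [0, upper]."""
    # Midpoints never hit x = 0, so the integrand is always defined.
    return midpoint_integral(lambda x: x ** (s - 1.0) / math.expm1(x), 0.0, upper, m)


def zeta_partial(s: float, terms: int = 100_000) -> float:
    """Partial sum of the series defining zeta(s) for s > 1."""
    return sum(n ** (-s) for n in range(1, terms + 1))


if __name__ == "__main__":
    s = 2.0
    print("integral          :", zeta_gamma_integral(s))
    print("zeta(s) * Gamma(s):", zeta_partial(s) * math.gamma(s))
    print("pi**2 / 6         :", math.pi ** 2 / 6.0)
```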


10.14 MEASURABLE FUNCTIONS 

Every function $f$ which is Lebesgue-integrable on an interval $I$ is the limit, almost everywhere on $I$, of a certain sequence of step functions. However, the converse is not true. For example, the constant function $f = 1$ is a limit of step functions on the real line $\mathbf{R}$, but this function is not in $L(\mathbf{R})$. Therefore, the class of functions which are limits of step functions is larger than the class of Lebesgue-integrable functions. The functions in this larger class are called measurable functions.

Definition 10.34. A function $f$ defined on $I$ is called measurable on $I$, and we write $f \in M(I)$, if there exists a sequence of step functions $\{s_n\}$ on $I$ such that

\[ \lim_{n\to\infty} s_n(x) = f(x) \quad \text{almost everywhere on } I. \]

note. If $f$ is measurable on $I$ then $f$ is measurable on every subinterval of $I$.

As already noted, every function in $L(I)$ is measurable on $I$, but the converse is not true. The next theorem provides a partial converse.

Theorem 10.35. If $f \in M(I)$ and if $|f(x)| \le g(x)$ almost everywhere on $I$ for some nonnegative $g$ in $L(I)$, then $f \in L(I)$.

Proof. There is a sequence of step functions $\{s_n\}$ such that $s_n(x) \to f(x)$ almost everywhere on $I$. Now apply Theorem 10.30 to deduce that $f \in L(I)$.

Corollary 1. If $f \in M(I)$ and $|f| \in L(I)$, then $f \in L(I)$.

Corollary 2. If $f$ is measurable and bounded on a bounded interval $I$, then $f \in L(I)$.

Further properties of measurable functions are given in the next theorem. 

Theorem 10.36. Let $\varphi$ be a real-valued function continuous on $\mathbf{R}^2$. If $f \in M(I)$ and $g \in M(I)$, define $h$ on $I$ by the equation

\[ h(x) = \varphi[f(x), g(x)]. \]

Then $h \in M(I)$. In particular, $f + g$, $f \cdot g$, $|f|$, $\max(f, g)$, and $\min(f, g)$ are in $M(I)$. Also, $1/f \in M(I)$ if $f(x) \ne 0$ almost everywhere on $I$.

Proof. Let $\{s_n\}$ and $\{t_n\}$ denote sequences of step functions such that $s_n \to f$ and $t_n \to g$ almost everywhere on $I$. Then the function $u_n = \varphi(s_n, t_n)$ is a step function such that $u_n \to h$ almost everywhere on $I$. Hence $h \in M(I)$.

The next theorem shows that the class $M(I)$ cannot be enlarged by taking limits of functions in $M(I)$.

Theorem 10.37. Let $f$ be defined on $I$ and assume that $\{f_n\}$ is a sequence of measurable functions on $I$ such that $f_n(x) \to f(x)$ almost everywhere on $I$. Then $f$ is measurable on $I$.

Proof. Choose any positive function $g$ in $L(I)$, for example, $g(x) = 1/(1 + x^2)$ for all $x$ in $I$. Let

\[ F_n(x) = g(x)\,\frac{f_n(x)}{1 + |f_n(x)|} \quad \text{for } x \text{ in } I. \]

Then

\[ F_n(x) \to \frac{g(x) f(x)}{1 + |f(x)|} \quad \text{almost everywhere on } I. \]

Let $F(x) = g(x)f(x)/\{1 + |f(x)|\}$. Since each $F_n$ is measurable on $I$ and since $|F_n(x)| < g(x)$ for all $x$, Theorem 10.35 shows that each $F_n \in L(I)$. Also, $|F(x)| < g(x)$ for all $x$ in $I$ so, by Theorem 10.30, $F \in L(I)$ and hence $F \in M(I)$. Now we have

\[ f(x)\{g(x) - |F(x)|\} = f(x) g(x) \left\{ 1 - \frac{|f(x)|}{1 + |f(x)|} \right\} = \frac{f(x) g(x)}{1 + |f(x)|} = F(x) \]

for all $x$ in $I$, so

\[ f(x) = \frac{F(x)}{g(x) - |F(x)|}. \]

Therefore $f \in M(I)$ since each of $F$, $g$, and $|F|$ is in $M(I)$ and $g(x) - |F(x)| > 0$ for all $x$ in $I$.


note. There exist nonmeasurable functions, but the foregoing theorems show that 
it is not easy to construct an example. The usual operations of analysis, applied to 
measurable functions, produce measurable functions. Therefore, every function 
which occurs in practice is likely to be measurable. (See Exercise 10.37 for an 
example of a nonmeasurable function.)


10.15 CONTINUITY OF FUNCTIONS DEFINED BY LEBESGUE INTEGRALS

Let $f$ be a real-valued function of two variables defined on a subset of $\mathbf{R}^2$ of the form $X \times Y$, where each of $X$ and $Y$ is a general subinterval of $\mathbf{R}$. Many functions in analysis appear as integrals of the form

\[ F(y) = \int_X f(x, y)\,dx. \]

We shall discuss three theorems which transmit continuity, differentiability, and integrability from the integrand $f$ to the function $F$. The first theorem concerns continuity.


Theorem 10.38. Let $X$ and $Y$ be two subintervals of $\mathbf{R}$, and let $f$ be a function defined on $X \times Y$ and satisfying the following conditions:

a) For each fixed $y$ in $Y$, the function $f_y$ defined on $X$ by the equation

\[ f_y(x) = f(x, y) \]

is measurable on $X$.

b) There exists a nonnegative function $g$ in $L(X)$ such that, for each $y$ in $Y$,

\[ |f(x, y)| \le g(x) \quad \text{a.e. on } X. \]

c) For each fixed $y$ in $Y$,

\[ \lim_{t\to y} f(x, t) = f(x, y) \quad \text{a.e. on } X. \]

Then the Lebesgue integral $\int_X f(x, y)\,dx$ exists for each $y$ in $Y$, and the function $F$ defined by the equation

\[ F(y) = \int_X f(x, y)\,dx \]

is continuous on $Y$. That is, if $y \in Y$ we have

\[ \lim_{t\to y} \int_X f(x, t)\,dx = \int_X \lim_{t\to y} f(x, t)\,dx. \]

Proof. Since $f_y$ is measurable on $X$ and dominated almost everywhere on $X$ by a nonnegative function $g$ in $L(X)$, Theorem 10.35 shows that $f_y \in L(X)$. In other words, the Lebesgue integral $\int_X f(x, y)\,dx$ exists for each $y$ in $Y$.

Now choose a fixed $y$ in $Y$ and let $\{y_n\}$ be any sequence of points in $Y$ such that $\lim y_n = y$. We will prove that $\lim F(y_n) = F(y)$. Let $G_n(x) = f(x, y_n)$. Each $G_n \in L(X)$ and (c) shows that $G_n(x) \to f(x, y)$ almost everywhere on $X$. Note that $F(y_n) = \int_X G_n(x)\,dx$. Since (b) holds, the Lebesgue dominated convergence theorem shows that the sequence $\{F(y_n)\}$ converges and that

\[ \lim_{n\to\infty} F(y_n) = \int_X f(x, y)\,dx = F(y). \]


Example 1. Continuity of the Gamma function $\Gamma(y) = \int_0^{+\infty} e^{-x} x^{y-1}\,dx$ for $y > 0$. We apply Theorem 10.38 with $X = [0, +\infty)$, $Y = (0, +\infty)$. For each $y > 0$ the integrand, as a function of $x$, is continuous (hence measurable) almost everywhere on $X$, so (a) holds. For each fixed $x > 0$, the integrand, as a function of $y$, is continuous on $Y$, so (c) holds. Finally, we verify (b), not on $Y$ but on every compact subinterval $[a, b]$, where $0 < a < b$. For each $y$ in $[a, b]$ the integrand is dominated by the function

\[ g(x) = \begin{cases} x^{a-1} & \text{if } 0 < x \le 1, \\ M e^{-x/2} & \text{if } x > 1, \end{cases} \]

where $M$ is some positive constant. This $g$ is Lebesgue-integrable on $X$, by Theorem 10.18, so Theorem 10.38 tells us that $\Gamma$ is continuous on $[a, b]$. But since this is true for every subinterval $[a, b]$, it follows that $\Gamma$ is continuous on $Y = (0, +\infty)$.


Example 2. Continuity of

\[ F(y) = \int_0^{+\infty} e^{-xy}\,\frac{\sin x}{x}\,dx \]

for $y > 0$. In this example it is understood that the quotient $(\sin x)/x$ is to be replaced by 1 when $x = 0$. Let $X = [0, +\infty)$, $Y = (0, +\infty)$. Conditions (a) and (c) of Theorem 10.38 are satisfied. As in Example 1, we verify (b) on each subinterval $Y_a = [a, +\infty)$, $a > 0$. Since $|(\sin x)/x| \le 1$, the integrand is dominated on $Y_a$ by the function

\[ g(x) = e^{-ax} \quad \text{for } x \ge 0. \]

Since $g$ is Lebesgue-integrable on $X$, $F$ is continuous on $Y_a$ for every $a > 0$; hence $F$ is continuous on $Y = (0, +\infty)$.

To illustrate another use of the Lebesgue dominated convergence theorem we shall prove that $F(y) \to 0$ as $y \to +\infty$.

Let $\{y_n\}$ be any increasing sequence of real numbers such that $y_n \ge 1$ and $y_n \to +\infty$ as $n \to \infty$. We will prove that $F(y_n) \to 0$ as $n \to \infty$. Let

\[ f_n(x) = e^{-x y_n}\,\frac{\sin x}{x} \quad \text{for } x \ge 0. \]

Then $\lim_{n\to\infty} f_n(x) = 0$ almost everywhere on $[0, +\infty)$, in fact, for all $x$ except 0. Now

\[ y_n \ge 1 \quad \text{implies} \quad |f_n(x)| \le e^{-x} \quad \text{for all } x > 0. \]

Also, each $f_n$ is Riemann-integrable on $[0, b]$ for every $b > 0$ and

\[ \int_0^b |f_n| \le \int_0^b e^{-x}\,dx < 1. \]

Therefore, by Theorem 10.33, $f_n$ is Lebesgue-integrable on $[0, +\infty)$. Since the sequence $\{f_n\}$ is dominated by the function $g(x) = e^{-x}$ which is Lebesgue-integrable on $[0, +\infty)$, the Lebesgue dominated convergence theorem shows that the sequence $\{\int_0^{+\infty} f_n\}$ converges and that

\[ \lim_{n\to\infty} \int_0^{+\infty} f_n = \int_0^{+\infty} \lim_{n\to\infty} f_n = 0. \]

But $\int_0^{+\infty} f_n = F(y_n)$, so $F(y_n) \to 0$ as $n \to \infty$. Hence, $F(y) \to 0$ as $y \to +\infty$.

note. In much of the material that follows, we shall have occasion to deal with 
integrals involving the quotient (sin x)/x. It will be understood that this quotient 
is to be replaced by 1 when x = 0. Similarly, a quotient of the form (sin xy)/x is 
to be replaced by y, its limit as x -* 0. More generally, if we are dealing with an 
integrand which has removable discontinuities at certain isolated points within 
the interval of integration, we will agree that these discontinuities are to be “re- 
moved” by redefining the integrand suitably at these exceptional points. At points 
where the integrand is not defined, we assign the value 0 to the integrand. 


10.16 DIFFERENTIATION UNDER THE INTEGRAL SIGN

Theorem 10.39. Let $X$ and $Y$ be two subintervals of $\mathbf{R}$, and let $f$ be a function defined on $X \times Y$ and satisfying the following conditions:

a) For each fixed $y$ in $Y$, the function $f_y$ defined on $X$ by the equation $f_y(x) = f(x, y)$ is measurable on $X$, and $f_a \in L(X)$ for some $a$ in $Y$.

b) The partial derivative $D_2 f(x, y)$ exists for each interior point $(x, y)$ of $X \times Y$.

c) There is a nonnegative function $G$ in $L(X)$ such that

\[ |D_2 f(x, y)| \le G(x) \quad \text{for all interior points of } X \times Y. \]

Then the Lebesgue integral $\int_X f(x, y)\,dx$ exists for every $y$ in $Y$, and the function $F$ defined by

\[ F(y) = \int_X f(x, y)\,dx \]

is differentiable at each interior point of $Y$. Moreover, its derivative is given by the formula

\[ F'(y) = \int_X D_2 f(x, y)\,dx. \]

note. The derivative F'(y) is said to be obtained by differentiation under the 
integral sign. 

Proof. First we establish the inequality

\[ |f_y(x)| \le |f_a(x)| + |y - a|\,G(x), \tag{23} \]

for all interior points $(x, y)$ of $X \times Y$. The Mean-Value Theorem gives us

\[ f(x, y) - f(x, a) = (y - a)\,D_2 f(x, c), \]

where $c$ lies between $a$ and $y$. Since $|D_2 f(x, c)| \le G(x)$, this implies

\[ |f(x, y)| \le |f(x, a)| + |y - a|\,G(x), \]

which proves (23). Since $f_y$ is measurable on $X$ and dominated almost everywhere on $X$ by a nonnegative function in $L(X)$, Theorem 10.35 shows that $f_y \in L(X)$. In other words, the integral $\int_X f(x, y)\,dx$ exists for each $y$ in $Y$.

Now choose any sequence $\{y_n\}$ of points in $Y$ such that each $y_n \ne y$ but $\lim y_n = y$. Define a sequence of functions $\{q_n\}$ on $X$ by the equation

\[ q_n(x) = \frac{f(x, y_n) - f(x, y)}{y_n - y}. \]

Then $q_n \in L(X)$ and $q_n(x) \to D_2 f(x, y)$ at each interior point of $X$. By the Mean-Value Theorem we have $q_n(x) = D_2 f(x, c_n)$, where $c_n$ lies between $y_n$ and $y$. Hence, by (c) we have $|q_n(x)| \le G(x)$ almost everywhere on $X$. Lebesgue's dominated convergence theorem shows that the sequence $\{\int_X q_n\}$ converges, the integral $\int_X D_2 f(x, y)\,dx$ exists, and

\[ \lim_{n\to\infty} \int_X q_n = \int_X \lim_{n\to\infty} q_n = \int_X D_2 f(x, y)\,dx. \]

But

\[ \int_X q_n = \frac{1}{y_n - y} \int_X \{f(x, y_n) - f(x, y)\}\,dx = \frac{F(y_n) - F(y)}{y_n - y}. \]

Since this last quotient tends to a limit for all sequences $\{y_n\}$, it follows that $F'(y)$ exists and that

\[ F'(y) = \lim_{n\to\infty} \int_X q_n = \int_X D_2 f(x, y)\,dx. \]


Example 1. Derivative of the Gamma function. The derivative $\Gamma'(y)$ exists for each $y > 0$ and is given by the integral

\[ \Gamma'(y) = \int_0^{+\infty} e^{-x} x^{y-1} \log x\,dx, \]

obtained by differentiating the integral for $\Gamma(y)$ under the integral sign. This is a consequence of Theorem 10.39 because for each $y$ in $[a, b]$, $0 < a < b$, the partial derivative $D_2(e^{-x} x^{y-1})$ is dominated a.e. by a function $g$ which is integrable on $[0, +\infty)$. In fact,

\[ D_2(e^{-x} x^{y-1}) = \frac{\partial}{\partial y}\,(e^{-x} x^{y-1}) = e^{-x} x^{y-1} \log x \quad \text{if } x > 0, \]

so if $y \ge a$ the partial derivative is dominated (except at 0) by the function

\[ g(x) = \begin{cases} x^{a-1}\,|\log x| & \text{if } 0 < x < 1, \\ M e^{-x/2} & \text{if } x \ge 1, \\ 0 & \text{if } x = 0, \end{cases} \]

where $M$ is some positive constant. The reader can easily verify that $g$ is Lebesgue-integrable on $[0, +\infty)$.

Example 2. Evaluation of the integral

\[ F(y) = \int_0^{+\infty} e^{-xy}\,\frac{\sin x}{x}\,dx. \]

Applying Theorem 10.39, we find

\[ F'(y) = -\int_0^{+\infty} e^{-xy} \sin x\,dx \quad \text{if } y > 0. \]

(As in Example 1, we prove the result on every interval $Y_a = [a, +\infty)$, $a > 0$.) In this example, the Riemann integral $\int_0^b e^{-xy} \sin x\,dx$ can be calculated by the methods of elementary calculus (using integration by parts twice). This gives us

\[ \int_0^b e^{-xy} \sin x\,dx = \frac{e^{-by}(-y \sin b - \cos b)}{1 + y^2} + \frac{1}{1 + y^2} \tag{24} \]

for all real $y$. Letting $b \to +\infty$ we find

\[ \int_0^{+\infty} e^{-xy} \sin x\,dx = \frac{1}{1 + y^2} \quad \text{if } y > 0. \]

Therefore $F'(y) = -1/(1 + y^2)$ if $y > 0$. Integration of this equation gives us

\[ F(y) - F(b) = -\int_b^y \frac{dt}{1 + t^2} = \arctan b - \arctan y, \quad \text{for } y > 0,\ b > 0. \]

Now let $b \to +\infty$. Then $\arctan b \to \pi/2$ and $F(b) \to 0$ (see Example 2, Section 10.15), so $F(y) = \pi/2 - \arctan y$. In other words, we have

\[ \int_0^{+\infty} e^{-xy}\,\frac{\sin x}{x}\,dx = \frac{\pi}{2} - \arctan y \quad \text{if } y > 0. \tag{25} \]

This equation is also valid if $y = 0$. That is, we have the formula

\[ \int_0^{+\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{26} \]

However, we cannot deduce this by putting $y = 0$ in (25) because we have not shown that $F$ is continuous at 0. In fact, the integral in (26) exists as an improper Riemann integral. It does not exist as a Lebesgue integral. (See Exercise 10.9.)
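
Formula (25) lends itself to a quick numerical check for a few values of $y > 0$. The sketch below is illustrative and rests on stated assumptions: the infinite interval is truncated at a hypothetical upper limit of 80, which is harmless for the sample values of $y$ because of the factor $e^{-xy}$, and the function name `F` simply mirrors the notation of the example.

```python
import math


def midpoint_integral(f, a: float, b: float, m: int) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b], m subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))


def F(y: float, upper: float = 80.0, m: int = 200_000) -> float:
    """Approximate the integral of exp(-x*y) * sin(x)/x over [0, upper] for y > 0."""
    # Midpoints are never 0, so sin(x)/x needs no special case here.
    return midpoint_integral(lambda x: math.exp(-x * y) * math.sin(x) / x, 0.0, upper, m)


if __name__ == "__main__":
    for y in (0.5, 1.0, 2.0):
        print(y, F(y), math.pi / 2.0 - math.atan(y))
```

The truncation is not suitable for $y = 0$, where the integral converges only as an improper Riemann integral; that case is the subject of formula (26).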
Example 3. Proof of the formula

\[ \int_0^{+\infty} \frac{\sin x}{x}\,dx = \lim_{b\to+\infty} \int_0^b \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]

Let $\{g_n\}$ be the sequence of functions defined for all real $y$ by the equation

\[ g_n(y) = \int_0^n e^{-xy}\,\frac{\sin x}{x}\,dx. \tag{27} \]

First we note that $g_n(n) \to 0$ as $n \to \infty$ since

\[ |g_n(n)| \le \int_0^n e^{-xn}\,dx = \frac{1}{n} \int_0^{n^2} e^{-t}\,dt < \frac{1}{n}. \]

Now we differentiate (27) and use (24) to obtain

\[ g_n'(y) = -\int_0^n e^{-xy} \sin x\,dx = -\frac{e^{-ny}(-y \sin n - \cos n) + 1}{1 + y^2}, \]

an equation valid for all real $y$. This shows that $g_n'(y) \to -1/(1 + y^2)$ for all $y > 0$ and that

\[ |g_n'(y)| \le \frac{e^{-y}(y + 1) + 1}{1 + y^2} \quad \text{for all } y \ge 0. \]

Therefore the function $f_n$ defined by

\[ f_n(y) = \begin{cases} g_n'(y) & \text{if } 0 \le y \le n, \\ 0 & \text{if } y > n, \end{cases} \]

is Lebesgue-integrable on $[0, +\infty)$ and is dominated by the nonnegative function

\[ g(y) = \frac{e^{-y}(y + 1) + 1}{1 + y^2}. \]

Also, $g$ is Lebesgue-integrable on $[0, +\infty)$. Since $f_n(y) \to -1/(1 + y^2)$ almost everywhere on $[0, +\infty)$, the Lebesgue dominated convergence theorem implies

\[ \lim_{n\to\infty} \int_0^{+\infty} f_n = -\int_0^{+\infty} \frac{dy}{1 + y^2} = -\frac{\pi}{2}. \]

But we have

\[ \int_0^{+\infty} f_n = \int_0^n g_n'(y)\,dy = g_n(n) - g_n(0). \]

Letting $n \to \infty$, we find $g_n(0) \to \pi/2$.

Now if $b > 0$ and if $n = [b]$, we have

\[ \int_0^b \frac{\sin x}{x}\,dx = \int_0^n \frac{\sin x}{x}\,dx + \int_n^b \frac{\sin x}{x}\,dx = g_n(0) + \int_n^b \frac{\sin x}{x}\,dx. \]

Since

\[ 0 \le \left| \int_n^b \frac{\sin x}{x}\,dx \right| \le \int_n^b \frac{1}{n}\,dx = \frac{b - n}{n} \le \frac{1}{n} \to 0 \quad \text{as } b \to +\infty, \]

we have

\[ \lim_{b\to+\infty} \int_0^b \frac{\sin x}{x}\,dx = \lim_{n\to\infty} g_n(0) = \frac{\pi}{2}. \]

This formula will be needed in Chapter 11 in the study of Fourier series.


10.17 INTERCHANGING THE ORDER OF INTEGRATION 


Theorem 10.40. Let $X$ and $Y$ be two subintervals of $\mathbf{R}$, and let $k$ be a function which is defined, continuous, and bounded on $X \times Y$, say

\[ |k(x, y)| \le M \quad \text{for all } (x, y) \text{ in } X \times Y. \]

Assume $f \in L(X)$ and $g \in L(Y)$. Then we have:

a) For each $y$ in $Y$, the Lebesgue integral $\int_X f(x) k(x, y)\,dx$ exists, and the function $F$ defined on $Y$ by the equation

\[ F(y) = \int_X f(x) k(x, y)\,dx \]

is continuous on $Y$.

b) For each $x$ in $X$, the Lebesgue integral $\int_Y g(y) k(x, y)\,dy$ exists, and the function $G$ defined on $X$ by the equation

\[ G(x) = \int_Y g(y) k(x, y)\,dy \]

is continuous on $X$.

c) The two Lebesgue integrals $\int_Y g(y) F(y)\,dy$ and $\int_X f(x) G(x)\,dx$ exist and are equal. That is,

\[ \int_X f(x) \left[ \int_Y g(y) k(x, y)\,dy \right] dx = \int_Y g(y) \left[ \int_X f(x) k(x, y)\,dx \right] dy. \tag{28} \]

Proof. For each fixed $y$ in $Y$, let $f_y(x) = f(x) k(x, y)$. Then $f_y$ is measurable on $X$ and satisfies the inequality

\[ |f_y(x)| = |f(x) k(x, y)| \le M |f(x)| \quad \text{for all } x \text{ in } X. \]

Also, since $k$ is continuous on $X \times Y$ we have

\[ \lim_{t\to y} f(x) k(x, t) = f(x) k(x, y) \quad \text{for all } x \text{ in } X. \]

Therefore, part (a) follows from Theorem 10.38. A similar argument proves (b).
Now the product $f \cdot G$ is measurable on $X$ and satisfies the inequality

\[ |f(x) G(x)| \le |f(x)| \int_Y |g(y)|\,|k(x, y)|\,dy \le M' |f(x)|, \]

where $M' = M \int_Y |g(y)|\,dy$. By Theorem 10.35 we see that $f \cdot G \in L(X)$. A similar argument shows that $g \cdot F \in L(Y)$.

Next we prove (28). First we note that (28) is true if each of $f$ and $g$ is a step function. In this case, each of $f$ and $g$ vanishes outside a compact interval, so each is Riemann-integrable on that interval and (28) is an immediate consequence of Theorem 7.42.

Now we use Theorem 10.19(b) to approximate each of $f$ and $g$ by step functions. If $\varepsilon > 0$ is given, there are step functions $s$ and $t$ such that

\[ \int_X |f - s| < \varepsilon \quad \text{and} \quad \int_Y |g - t| < \varepsilon. \]

Therefore we have

\[ \int_X f \cdot G = \int_X s \cdot G + A_1, \tag{29} \]

where

\[ |A_1| = \left| \int_X (f - s) \cdot G \right| \le \int_X |f - s| \left[ \int_Y |g(y)|\,|k(x, y)|\,dy \right] dx < \varepsilon M \int_Y |g|. \]

Also, we have

\[ G(x) = \int_Y g(y) k(x, y)\,dy = \int_Y t(y) k(x, y)\,dy + A_2, \]

where

\[ |A_2| = \left| \int_Y (g - t) k(x, y)\,dy \right| \le M \int_Y |g - t| < \varepsilon M. \]

Therefore

\[ \int_X s \cdot G = \int_X s(x) \left[ \int_Y t(y) k(x, y)\,dy \right] dx + A_3, \]

where

\[ |A_3| = \left| \int_X A_2\, s(x)\,dx \right| \le \varepsilon M \int_X |s| \le \varepsilon M \int_X \{|s - f| + |f|\} < \varepsilon^2 M + \varepsilon M \int_X |f|, \]

so (29) becomes

\[ \int_X f \cdot G = \int_X s(x) \left[ \int_Y t(y) k(x, y)\,dy \right] dx + A_1 + A_3. \tag{30} \]

Similarly, we find

\[ \int_Y g \cdot F = \int_Y t(y) \left[ \int_X s(x) k(x, y)\,dx \right] dy + B_1 + B_3, \tag{31} \]

where

\[ |B_1| \le \varepsilon M \int_X |f| \quad \text{and} \quad |B_3| \le \varepsilon M \int_Y |t| < \varepsilon^2 M + \varepsilon M \int_Y |g|. \]

But the iterated integrals on the right of (30) and (31) are equal, so we have

\[ \left| \int_X f \cdot G - \int_Y g \cdot F \right| \le |A_1| + |A_3| + |B_1| + |B_3| < 2\varepsilon^2 M + 2\varepsilon M \left\{ \int_X |f| + \int_Y |g| \right\}. \]

Since this holds for every $\varepsilon > 0$ we have $\int_X f \cdot G = \int_Y g \cdot F$, as required.

note. A more general version of Theorem 10.40 will be proved in Chapter 15 
using double integrals. (See Theorem 15.6.) 


10.18 MEASURABLE SETS ON THE REAL LINE 


Definition 10.41. Given any nonempty subset $S$ of $\mathbf{R}$, the function $\chi_S$ defined by

\[ \chi_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \in \mathbf{R} - S, \end{cases} \]

is called the characteristic function of $S$. If $S$ is empty we define $\chi_S(x) = 0$ for all $x$.

Theorem 10.42. Let $\mathbf{R} = (-\infty, +\infty)$. Then we have:

a) If $S$ has measure 0, then $\chi_S \in L(\mathbf{R})$ and $\int_{\mathbf{R}} \chi_S = 0$.

b) If $\chi_S \in L(\mathbf{R})$ and if $\int_{\mathbf{R}} \chi_S = 0$, then $S$ has measure 0.

Proof. Part (a) follows by taking $f = \chi_S$ in Theorem 10.20. To prove (b), let $f_n = \chi_S$ for all $n$. Then $|f_n| = \chi_S$ so

\[ \sum_{n=1}^{\infty} \int_{\mathbf{R}} |f_n| = \sum_{n=1}^{\infty} \int_{\mathbf{R}} \chi_S = 0. \]

By the Levi theorem for absolutely convergent series, it follows that the series $\sum_{n=1}^{\infty} f_n(x)$ converges everywhere on $\mathbf{R}$ except for a set $T$ of measure 0. If $x \in S$, the series cannot converge since each term is 1. If $x \notin S$, the series converges because each term is 0. Hence $T = S$, so $S$ has measure 0.

Definition 10.43. A subset $S$ of $\mathbf{R}$ is called measurable if its characteristic function
$\chi_S$ is measurable. If, in addition, $\chi_S$ is Lebesgue-integrable on $\mathbf{R}$, then the measure
$\mu(S)$ of the set $S$ is defined by the equation

$$\mu(S) = \int_{\mathbf{R}} \chi_S.$$

If $\chi_S$ is measurable but not Lebesgue-integrable on $\mathbf{R}$, we define $\mu(S) = +\infty$. The
function $\mu$ so defined is called Lebesgue measure.

Examples

1. Theorem 10.42 shows that a set $S$ of measure zero is measurable and that $\mu(S) = 0$.

2. Every interval $I$ (bounded or unbounded) is measurable. If $I$ is a bounded interval
with endpoints $a \le b$, then $\mu(I) = b - a$. If $I$ is an unbounded interval, then
$\mu(I) = +\infty$.

3. If $A$ and $B$ are measurable and $A \subseteq B$, then $\mu(A) \le \mu(B)$.

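Theorem 10.44. a) If $S$ and $T$ are measurable, so is $S - T$.
b) If $S_1, S_2, \dots$ are measurable, so are $\bigcup_{i=1}^{\infty} S_i$ and $\bigcap_{i=1}^{\infty} S_i$.

Proof. To prove (a) we note that the characteristic function of $S - T$ is $\chi_S - \chi_S \chi_T$.
To prove (b), let

$$U_n = \bigcup_{i=1}^{n} S_i, \quad V_n = \bigcap_{i=1}^{n} S_i, \quad U = \bigcup_{i=1}^{\infty} S_i, \quad V = \bigcap_{i=1}^{\infty} S_i.$$

Then we have

$$\chi_{U_n} = \max(\chi_{S_1}, \dots, \chi_{S_n}) \quad\text{and}\quad \chi_{V_n} = \min(\chi_{S_1}, \dots, \chi_{S_n}),$$

so each of $U_n$ and $V_n$ is measurable. Also, $\chi_U = \lim_{n\to\infty} \chi_{U_n}$ and $\chi_V = \lim_{n\to\infty} \chi_{V_n}$,
so $U$ and $V$ are measurable.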
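
The three identities used in this proof are purely set-theoretic, so they can be checked on a small finite example. The sketch below is only an illustration: the finite "universe" and the sets in `S_list` are hypothetical stand-ins for subsets of the real line, and nothing here touches measurability itself.

```python
def chi(S):
    """Characteristic function of the set S, returned as a Python callable."""
    return lambda x: 1 if x in S else 0

universe = range(12)                       # finite stand-in for the real line
S_list = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 9}]
S, T = S_list[0], S_list[1]

U_n = set().union(*S_list)                       # union of S_1, ..., S_n
V_n = set(S_list[0]).intersection(*S_list[1:])   # intersection of S_1, ..., S_n

# chi_{S - T} = chi_S - chi_S * chi_T
diff_ok = all(chi(S - T)(x) == chi(S)(x) - chi(S)(x) * chi(T)(x) for x in universe)
# chi_{U_n} = max(chi_{S_1}, ..., chi_{S_n})
union_ok = all(chi(U_n)(x) == max(chi(A)(x) for A in S_list) for x in universe)
# chi_{V_n} = min(chi_{S_1}, ..., chi_{S_n})
inter_ok = all(chi(V_n)(x) == min(chi(A)(x) for A in S_list) for x in universe)

print(diff_ok, union_ok, inter_ok)
```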

Theorem 10.45. If $A$ and $B$ are disjoint measurable sets, then

$$\mu(A \cup B) = \mu(A) + \mu(B). \tag{32}$$

Proof. Let $S = A \cup B$. Since $A$ and $B$ are disjoint we have

$$\chi_S = \chi_A + \chi_B.$$

Suppose that $\chi_S$ is integrable. Since both $\chi_A$ and $\chi_B$ are measurable and satisfy

$$0 \le \chi_A(x) \le \chi_S(x), \quad 0 \le \chi_B(x) \le \chi_S(x) \quad\text{for all } x,$$

Theorem 10.35 shows that both $\chi_A$ and $\chi_B$ are integrable. Therefore

$$\mu(S) = \int_{\mathbf{R}} \chi_S = \int_{\mathbf{R}} \chi_A + \int_{\mathbf{R}} \chi_B = \mu(A) + \mu(B).$$

In this case (32) holds with both members finite.

If $\chi_S$ is not integrable then at least one of $\chi_A$ or $\chi_B$ is not integrable, in which
case (32) holds with both members infinite.


The following extension of Theorem 10.45 can be proved by induction. 


Theorem 10.46. If $\{A_1, \dots, A_n\}$ is a finite disjoint collection of measurable sets,
then

$$\mu\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} \mu(A_i).$$

note. This property is described by saying that Lebesgue measure is finitely
additive. In the next theorem we prove that Lebesgue measure is countably additive.


Theorem 10.47. If $\{A_1, A_2, \dots\}$ is a countable disjoint collection of measurable
sets, then

$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} \mu(A_i). \tag{33}$$

Proof. Let $T_n = \bigcup_{i=1}^{n} A_i$, $\chi_n = \chi_{T_n}$, $T = \bigcup_{i=1}^{\infty} A_i$. Since $\mu$ is finitely additive,
we have

$$\mu(T_n) = \sum_{i=1}^{n} \mu(A_i) \quad\text{for each } n.$$

We are to prove that $\mu(T_n) \to \mu(T)$ as $n \to \infty$. Note that $\mu(T_n) \le \mu(T_{n+1})$ so
$\{\mu(T_n)\}$ is an increasing sequence.

We consider two cases. If $\mu(T)$ is finite, then $\chi_T$ and each $\chi_n$ is integrable. Also,
the sequence $\{\mu(T_n)\}$ is bounded above by $\mu(T)$ so it converges. By the Lebesgue
dominated convergence theorem, $\mu(T_n) \to \mu(T)$.

If $\mu(T) = +\infty$, then $\chi_T$ is not integrable. Theorem 10.24 implies that either
some $\chi_n$ is not integrable or else every $\chi_n$ is integrable but $\mu(T_n) \to +\infty$. In either
case (33) holds with both members infinite.
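
A concrete instance of countable additivity may help fix ideas. The sketch below assumes the illustrative disjoint intervals $A_i = [2^{-i}, 2^{-(i-1)})$, whose union is $(0, 1)$; this choice is ours and does not appear in the text. It uses only the interval-length formula from Example 2 and exact rational arithmetic to list the partial sums $\mu(T_n)$.

```python
from fractions import Fraction

def mu_interval(a, b):
    """Measure of a bounded interval with endpoints a <= b (Example 2)."""
    return b - a

def partial_measures(n_terms):
    """Return [mu(T_1), ..., mu(T_n)] for A_i = [2^-i, 2^-(i-1))."""
    total = Fraction(0)
    out = []
    for i in range(1, n_terms + 1):
        a, b = Fraction(1, 2**i), Fraction(1, 2**(i - 1))
        total += mu_interval(a, b)       # finite additivity: add mu(A_i)
        out.append(total)
    return out

sums = partial_measures(10)
print(sums[-1])          # equals 1 - 2^-10
```

The partial sums increase and stay below 1, the measure of the union, in line with the finite-measure case of the proof.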


For a further study of measure theory and its relation to integration, the reader 
can consult the references at the end of this chapter. 


10.19 THE LEBESGUE INTEGRAL OVER ARBITRARY SUBSETS OF R 


Definition 10.48. Let $f$ be defined on a measurable subset $S$ of $\mathbf{R}$. Define a new
function $\tilde{f}$ on $\mathbf{R}$ as follows:

$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } x \in S, \\ 0 & \text{if } x \in \mathbf{R} - S. \end{cases}$$

If $\tilde{f}$ is Lebesgue-integrable on $\mathbf{R}$, we say that $f$ is Lebesgue-integrable on $S$ and we
write $f \in L(S)$. The integral of $f$ over $S$ is defined by the equation

$$\int_S f = \int_{\mathbf{R}} \tilde{f}.$$

This definition immediately gives the following properties:

If $f \in L(S)$, then $f \in L(T)$ for every measurable subset $T$ of $S$.

If $S$ has finite measure, then $\mu(S) = \int_S 1$.

The following theorem describes a countably additive property of the Lebesgue 
integral. Its proof is left as an exercise for the reader. 


Theorem 10.49. Let $\{A_1, A_2, \dots\}$ be a countable disjoint collection of sets in $\mathbf{R}$,
and let $S = \bigcup_{i=1}^{\infty} A_i$. Let $f$ be defined on $S$.

a) If $f \in L(S)$, then $f \in L(A_i)$ for each $i$ and

$$\int_S f = \sum_{i=1}^{\infty} \int_{A_i} f.$$

b) If $f \in L(A_i)$ for each $i$ and if the series in (a) converges, then $f \in L(S)$ and the
equation in (a) holds.


10.20 LEBESGUE INTEGRALS OF COMPLEX-VALUED FUNCTIONS 

If $f$ is a complex-valued function defined on an interval $I$, then $f = u + iv$, where
$u$ and $v$ are real. We say $f$ is Lebesgue-integrable on $I$ if both $u$ and $v$ are
Lebesgue-integrable on $I$, and we define

$$\int_I f = \int_I u + i \int_I v.$$

Similarly, $f$ is called measurable on $I$ if both $u$ and $v$ are in $M(I)$.

It is easy to verify that sums and products of complex-valued measurable
functions are also measurable. Moreover, since

$$|f| = (u^2 + v^2)^{1/2},$$

Theorem 10.36 shows that $|f|$ is measurable if $f$ is.

Many of the theorems concerning Lebesgue integrals of real-valued functions 
can be extended to complex-valued functions. However, we do not discuss these 
extensions since, in any particular case, it usually suffices to write f = u + iv 
and apply the theorems to u and v. The only result that needs to be formulated 
explicitly is the following.

Theorem 10.50. If a complex-valued function $f$ is Lebesgue-integrable on $I$, then
$|f| \in L(I)$ and we have

$$\left| \int_I f \right| \le \int_I |f|.$$

Proof. Write $f = u + iv$. Since $f$ is measurable and $|f| \le |u| + |v|$, Theorem
10.35 shows that $|f| \in L(I)$.

Let $a = \int_I f$. Then $a = re^{i\theta}$, where $r = |a|$. We wish to prove that $r \le \int_I |f|$.
Let

$$b = \begin{cases} e^{-i\theta} & \text{if } r > 0, \\ 1 & \text{if } r = 0. \end{cases}$$

Then $|b| = 1$ and $r = ba = b \int_I f = \int_I bf$. Now write $bf = U + iV$, where
$U$ and $V$ are real. Then $\int_I bf = \int_I U$, since $\int_I bf$ is real. Hence

$$r = \int_I bf = \int_I U \le \int_I |U| \le \int_I |bf| = \int_I |f|.$$
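
The rotation trick in this proof can be traced numerically. The following sketch is an illustration under stated assumptions: the integrand `f`, the interval $[0, 3]$, and the midpoint Riemann sum standing in for the integral are all our own choices. It computes $a$, the unit factor $b$, and checks that $\int bf$ is real, equals $r$, and is bounded by $\int |f|$.

```python
import cmath

def integrate(h, a, b, n=2000):
    """Midpoint Riemann sum of a (possibly complex-valued) function h on [a, b]."""
    step = (b - a) / n
    return sum(h(a + (i + 0.5) * step) for i in range(n)) * step

f = lambda x: cmath.exp(1j * x) * (1 + x)       # hypothetical complex-valued integrand
a_val = integrate(f, 0.0, 3.0)                  # a = integral of f
r, theta = abs(a_val), cmath.phase(a_val)       # a = r * exp(i * theta)
b = cmath.exp(-1j * theta) if r > 0 else 1      # the factor b from the proof
rotated = integrate(lambda x: b * f(x), 0.0, 3.0)   # integral of b*f, should equal r
bound = integrate(lambda x: abs(f(x)), 0.0, 3.0)    # integral of |f|

print(r, rotated, bound)
```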



10.21 INNER PRODUCTS AND NORMS 

This section introduces inner products and norms, concepts which play an im- 
portant role in the theory of Fourier series, to be discussed in Chapter 11. 

Definition 10.51. Let $f$ and $g$ be two real-valued functions in $L(I)$ whose product
$f \cdot g$ is in $L(I)$. Then the integral

$$\int_I f(x) g(x)\,dx \tag{34}$$

is called the inner product of $f$ and $g$, and is denoted by $(f, g)$. If $f^2 \in L(I)$, the
nonnegative number $(f, f)^{1/2}$, denoted by $\|f\|$, is called the $L^2$-norm of $f$.

note. The integral in (34) resembles the sum $\sum_{k=1}^{n} x_k y_k$ which defines the dot
product of two vectors $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. The function
values $f(x)$ and $g(x)$ in (34) play the role of the components $x_k$ and $y_k$, and
integration takes the place of summation. The $L^2$-norm of $f$ is analogous to the length of
a vector.

The first theorem gives a sufficient condition for a function in $L(I)$ to have an
$L^2$-norm.

Theorem 10.52. If $f \in L(I)$ and if $f$ is bounded almost everywhere on $I$, then
$f^2 \in L(I)$.

Proof. Since $f \in L(I)$, $f$ is measurable and hence $f^2$ is measurable on $I$ and satisfies
the inequality $|f(x)|^2 \le M |f(x)|$ almost everywhere on $I$, where $M$ is an upper
bound for $|f|$. By Theorem 10.35, $f^2 \in L(I)$.


10.22 THE SET $L^2(I)$ OF SQUARE-INTEGRABLE FUNCTIONS

Definition 10.53. We denote by $L^2(I)$ the set of all real-valued measurable functions
$f$ on $I$ such that $f^2 \in L(I)$. The functions in $L^2(I)$ are said to be square-integrable.

note. The set $L^2(I)$ is neither larger than nor smaller than $L(I)$. For example,
the function given by

$$f(x) = x^{-1/2} \quad\text{for } 0 < x \le 1, \qquad f(0) = 0,$$

is in $L([0, 1])$ but not in $L^2([0, 1])$. Similarly, the function $g(x) = 1/x$ for $x \ge 1$
is in $L^2([1, +\infty))$ but not in $L([1, +\infty))$.
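
The two examples in this note can be made tangible by looking at truncated integrals, which have elementary closed forms. The sketch below is only a numerical illustration (the helper names are ours): for a nonnegative function, bounded truncated integrals signal integrability, while unbounded ones rule it out.

```python
import math

def int_f(eps):
    """Integral of x**(-1/2) over [eps, 1]; tends to 2 as eps -> 0+."""
    return 2.0 - 2.0 * math.sqrt(eps)

def int_f_squared(eps):
    """Integral of (x**(-1/2))**2 = 1/x over [eps, 1]; unbounded as eps -> 0+."""
    return -math.log(eps)

def int_g(b):
    """Integral of 1/x over [1, b]; unbounded as b -> +infinity."""
    return math.log(b)

def int_g_squared(b):
    """Integral of 1/x**2 over [1, b]; tends to 1 as b -> +infinity."""
    return 1.0 - 1.0 / b

for k in (2, 4, 8, 16):
    eps, b = 10.0 ** -k, 10.0 ** k
    print(k, int_f(eps), int_f_squared(eps), int_g(b), int_g_squared(b))
```

The first and last columns settle down (to 2 and 1), while the middle two grow like $k \log 10$.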

Theorem 10.54. If $f \in L^2(I)$ and $g \in L^2(I)$, then $f \cdot g \in L(I)$ and $(af + bg) \in L^2(I)$
for every real $a$ and $b$.

Proof. Both $f$ and $g$ are measurable so $f \cdot g \in M(I)$. Since

$$|f(x) g(x)| \le \frac{f^2(x) + g^2(x)}{2},$$

Theorem 10.35 shows that $f \cdot g \in L(I)$. Also, $(af + bg) \in M(I)$ and

$$(af + bg)^2 = a^2 f^2 + 2ab\,fg + b^2 g^2,$$

so $(af + bg)^2 \in L(I)$.

Thus, the inner product $(f, g)$ is defined for every pair of functions $f$ and $g$ in
$L^2(I)$. The basic properties of inner products and norms are described in the next
theorem.


Theorem 10.55. If $f$, $g$, and $h$ are in $L^2(I)$ and if $c$ is real we have:

a) $(f, g) = (g, f)$ (commutativity).

b) $(f + g, h) = (f, h) + (g, h)$ (linearity).

c) $(cf, g) = c(f, g)$ (associativity).

d) $\|cf\| = |c|\,\|f\|$ (homogeneity).

e) $|(f, g)| \le \|f\|\,\|g\|$ (Cauchy-Schwarz inequality).

f) $\|f + g\| \le \|f\| + \|g\|$ (triangle inequality).


Proof. Parts (a) through (d) are immediate consequences of the definition. Part (e)
follows at once from the inequality

$$\int_I \left[ \int_I |f(x) g(y) - g(x) f(y)|^2\,dy \right] dx \ge 0,$$

since expanding the square shows that the left member equals $2\|f\|^2 \|g\|^2 - 2(f, g)^2$.

To prove (f) we use (e) along with the relation

$$\|f + g\|^2 = (f + g, f + g) = (f, f) + 2(f, g) + (g, g) = \|f\|^2 + \|g\|^2 + 2(f, g).$$

note. The notion of inner product can be extended to complex-valued functions
$f$ such that $|f| \in L^2(I)$. In this case, $(f, g)$ is defined by the equation

$$(f, g) = \int_I f(x) \overline{g(x)}\,dx,$$

where the bar denotes the complex conjugate. The conjugate is introduced so that
the inner product of $f$ with itself will be a nonnegative quantity, namely,
$(f, f) = \int_I |f|^2$. The $L^2$-norm of $f$ is, as before, $\|f\| = (f, f)^{1/2}$.

Theorem 10.55 is also valid for complex functions, except that part (a) must be
modified by writing

$$(f, g) = \overline{(g, f)}. \tag{35}$$

This implies the following companion result to part (b):

$$(f, g + h) = \overline{(g + h, f)} = \overline{(g, f)} + \overline{(h, f)} = (f, g) + (f, h).$$

In parts (c) and (d) the constant $c$ can be complex. From (c) and (35) we obtain

$$(f, cg) = \bar{c}\,(f, g).$$

The Cauchy-Schwarz inequality and the triangle inequality are also valid for
complex functions.
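
Parts (e) and (f) of Theorem 10.55 are easy to test on a specific pair of real functions. The sketch below makes several assumptions of its own: the interval $[0, 1]$, the functions $f(x) = x$ and $g(x) = e^x$, and a midpoint Riemann sum as a stand-in for the integral. It is meant as an illustration, not as a proof.

```python
import math

def inner(u, v, a=0.0, b=1.0, n=1000):
    """Midpoint-rule approximation to the inner product (u, v) on [a, b]."""
    h = (b - a) / n
    return sum(u(a + (i + 0.5) * h) * v(a + (i + 0.5) * h) for i in range(n)) * h

def norm(u):
    """Approximate L^2-norm, (u, u)^(1/2)."""
    return math.sqrt(inner(u, u))

f = lambda x: x
g = math.exp

fg = inner(f, g)                                   # exact value of the integral is 1
cs_gap = norm(f) * norm(g) - abs(fg)               # Cauchy-Schwarz: should be >= 0
tri_gap = norm(f) + norm(g) - norm(lambda x: f(x) + g(x))   # triangle: should be >= 0

print(fg, cs_gap, tri_gap)
```

Because the midpoint sum is itself a weighted dot product, both inequalities hold for the approximations as well, not just in the limit.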


10.23 THE SET $L^2(I)$ AS A SEMIMETRIC SPACE

We recall (Definition 3.32) that a metric space is a set $T$ together with a nonnegative
function $d$ on $T \times T$ satisfying the following properties for all points $x$, $y$, $z$ in $T$:

1. $d(x, x) = 0$. 2. $d(x, y) > 0$ if $x \ne y$.

3. $d(x, y) = d(y, x)$. 4. $d(x, y) \le d(x, z) + d(z, y)$.

We try to convert $L^2(I)$ into a metric space by defining the distance $d(f, g)$ between
any two complex-valued functions in $L^2(I)$ by the equation

$$d(f, g) = \|f - g\| = \left( \int_I |f - g|^2 \right)^{1/2}.$$

This function satisfies properties 1, 3, and 4, but not 2. If $f$ and $g$ are functions in
$L^2(I)$ which differ on a nonempty set of measure zero, then $f \ne g$ but $f - g = 0$
almost everywhere on $I$, so $d(f, g) = 0$.

A function $d$ which satisfies 1, 3, and 4, but not 2, is called a semimetric. The
set $L^2(I)$, together with the semimetric $d$, is called a semimetric space.


10.24 A CONVERGENCE THEOREM FOR SERIES OF FUNCTIONS IN $L^2(I)$

The following convergence theorem is analogous to the Levi theorem for series
(Theorem 10.26).

Theorem 10.56. Let $\{g_n\}$ be a sequence of functions in $L^2(I)$ such that the series

$$\sum_{n=1}^{\infty} \|g_n\|$$

converges. Then the series of functions $\sum_{n=1}^{\infty} g_n$ converges almost everywhere on $I$
to a function $g$ in $L^2(I)$, and we have

$$\|g\| = \lim_{n\to\infty} \left\| \sum_{k=1}^{n} g_k \right\| \le \sum_{k=1}^{\infty} \|g_k\|. \tag{36}$$

Proof. Let $M = \sum_{n=1}^{\infty} \|g_n\|$. The triangle inequality, extended to finite sums,
gives us

$$\left\| \sum_{k=1}^{n} |g_k| \right\| \le \sum_{k=1}^{n} \|g_k\| \le M.$$

This implies

$$\int_I \left( \sum_{k=1}^{n} |g_k(x)| \right)^2 dx = \left\| \sum_{k=1}^{n} |g_k| \right\|^2 \le M^2. \tag{37}$$

If $x \in I$, let

$$f_n(x) = \left( \sum_{k=1}^{n} |g_k(x)| \right)^2.$$

The sequence $\{f_n\}$ is increasing, each $f_n \in L(I)$ (since each $g_k \in L^2(I)$), and (37)
shows that $\int_I f_n \le M^2$. Therefore the sequence $\{\int_I f_n\}$ converges. By the Levi
theorem for sequences (Theorem 10.24), there is a function $f$ in $L(I)$ such that
$f_n \to f$ almost everywhere on $I$, and

$$\int_I f = \lim_{n\to\infty} \int_I f_n \le M^2.$$

Therefore the series $\sum_{k=1}^{\infty} g_k(x)$ converges absolutely almost everywhere on $I$. Let

$$g(x) = \lim_{n\to\infty} \sum_{k=1}^{n} g_k(x)$$

at those points where the limit exists, and let

$$G_n(x) = \left| \sum_{k=1}^{n} g_k(x) \right|^2.$$

Then each $G_n \in L(I)$ and $G_n(x) \to |g(x)|^2$ almost everywhere on $I$. Also,

$$G_n(x) \le f_n(x) \le f(x) \quad\text{a.e. on } I.$$

Therefore, by the Lebesgue dominated convergence theorem, $|g|^2 \in L(I)$ and

$$\int_I |g|^2 = \lim_{n\to\infty} \int_I G_n. \tag{38}$$

Since $g$ is measurable, this shows that $g \in L^2(I)$. Also, we have

$$\int_I G_n = \left\| \sum_{k=1}^{n} g_k \right\|^2 \quad\text{and}\quad \int_I G_n \le \int_I f_n \le M^2,$$

so (38) implies

$$\|g\|^2 = \lim_{n\to\infty} \left\| \sum_{k=1}^{n} g_k \right\|^2 \le M^2,$$

and this, in turn, implies (36).


10.25 THE RIESZ-FISCHER THEOREM 


The convergence theorem which we have just proved can be used to prove that
every Cauchy sequence in the semimetric space $L^2(I)$ converges to a function in
$L^2(I)$. In other words, the semimetric space $L^2(I)$ is complete. This result, called
the Riesz-Fischer theorem, plays an important role in the theory of Fourier series.

Theorem 10.57. Let $\{f_n\}$ be a Cauchy sequence of complex-valued functions in
$L^2(I)$. That is, assume that for every $\varepsilon > 0$ there is an integer $N$ such that

$$\|f_m - f_n\| < \varepsilon \quad\text{whenever } m \ge n \ge N. \tag{39}$$

Then there exists a function $f$ in $L^2(I)$ such that

$$\lim_{n\to\infty} \|f_n - f\| = 0. \tag{40}$$

Proof. By applying (39) repeatedly we can find an increasing sequence of integers
$n(1) < n(2) < \cdots$ such that

$$\|f_m - f_{n(k)}\| < \frac{1}{2^k} \quad\text{whenever } m \ge n(k).$$

Let $g_1 = f_{n(1)}$, and let $g_k = f_{n(k)} - f_{n(k-1)}$ for $k \ge 2$. Then the series $\sum_{k=1}^{\infty} \|g_k\|$
converges, since it is dominated by

$$\|f_{n(1)}\| + \sum_{k=2}^{\infty} \|f_{n(k)} - f_{n(k-1)}\| < \|f_{n(1)}\| + \sum_{k=2}^{\infty} \frac{1}{2^{k-1}} = \|f_{n(1)}\| + 1.$$

Each $g_n$ is in $L^2(I)$. Hence, by Theorem 10.56, the series $\sum_{n=1}^{\infty} g_n$ converges almost
everywhere on $I$ to a function $f$ in $L^2(I)$. To complete the proof we will show that
$\|f_m - f\| \to 0$ as $m \to \infty$.

For this purpose we use the triangle inequality to write

$$\|f_m - f\| \le \|f_m - f_{n(k)}\| + \|f_{n(k)} - f\|. \tag{41}$$

If $m \ge n(k)$, the first term on the right is $< 1/2^k$. To estimate the second term we
note that

$$f - f_{n(k)} = \sum_{r=k+1}^{\infty} \{ f_{n(r)} - f_{n(r-1)} \},$$

and that the series $\sum_{r=k+1}^{\infty} \|f_{n(r)} - f_{n(r-1)}\|$ converges. Therefore, we can use
inequality (36) of Theorem 10.56 to write

$$\|f - f_{n(k)}\| \le \sum_{r=k+1}^{\infty} \|f_{n(r)} - f_{n(r-1)}\| < \sum_{r=k+1}^{\infty} \frac{1}{2^{r-1}} = \frac{1}{2^{k-1}}.$$

Hence, (41) becomes

$$\|f_m - f\| \le \frac{1}{2^k} + \frac{1}{2^{k-1}} = \frac{3}{2^k} \quad\text{if } m \ge n(k).$$

Since $n(k) \to \infty$ as $k \to \infty$, this shows that $\|f_m - f\| \to 0$ as $m \to \infty$.

note. In the course of the proof we have shown that every Cauchy sequence of
functions in $L^2(I)$ has a subsequence which converges pointwise almost everywhere
on $I$ to a limit function $f$ in $L^2(I)$. However, it does not follow that the sequence
$\{f_n\}$ itself converges pointwise almost everywhere to $f$ on $I$. (A counterexample is
described in Section 9.13.) Although $\{f_n\}$ converges to $f$ in the semimetric space
$L^2(I)$, this convergence is not the same as pointwise convergence.
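
The distinction drawn in this note can be seen in a standard example, sometimes called the "typewriter" sequence. It is offered here as an assumption-laden sketch and is not necessarily the counterexample of Section 9.13: writing $n = 2^k + j$ with $0 \le j < 2^k$, let $f_n$ be the characteristic function of $[j/2^k, (j+1)/2^k]$ on $I = [0, 1]$. Then $\|f_n\| = 2^{-k/2} \to 0$, yet at every point the value 1 recurs in each block, while the subsequence $f_{2^k}$ does converge to 0 at every $x > 0$.

```python
import math

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j); requires n >= 1."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def f(n, x):
    """Characteristic function of [j / 2**k, (j + 1) / 2**k] evaluated at x."""
    k, j = block(n)
    return 1.0 if j / 2**k <= x <= (j + 1) / 2**k else 0.0

def l2_norm(n):
    """L^2-norm of f_n on [0, 1]: square root of the interval length."""
    k, _ = block(n)
    return math.sqrt(1 / 2**k)

x = 0.3
hits = [n for n in range(1, 64) if f(n, x) == 1.0]     # indices where f_n(x) = 1
sub = [f(2**k, x) for k in range(2, 10)]               # subsequence n(k) = 2**k
print(hits, sub, l2_norm(1024))
```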


EXERCISES 


Upper functions 

10.1 Prove that $\max(f, g) + \min(f, g) = f + g$, and that

$$\max(f + h, g + h) = \max(f, g) + h, \qquad \min(f + h, g + h) = \min(f, g) + h.$$

10.2 Let $\{f_n\}$ and $\{g_n\}$ be increasing sequences of functions on an interval $I$. Let
$u_n = \max(f_n, g_n)$ and $v_n = \min(f_n, g_n)$.

a) Prove that $\{u_n\}$ and $\{v_n\}$ are increasing on $I$.

b) If $f_n \nearrow f$ a.e. on $I$ and if $g_n \nearrow g$ a.e. on $I$, prove that $u_n \nearrow \max(f, g)$ and
$v_n \nearrow \min(f, g)$ a.e. on $I$.

10.3 Let $\{s_n\}$ be an increasing sequence of step functions which converges pointwise on
an interval $I$ to a limit function $f$. If $I$ is unbounded and if $f(x) \ge 1$ almost everywhere on
$I$, prove that the sequence $\{\int_I s_n\}$ diverges.

10.4 This exercise gives an example of an upper function $f$ on the interval $I = [0, 1]$
such that $-f \notin U(I)$. Let $\{r_1, r_2, \dots\}$ denote the set of rational numbers in $[0, 1]$ and
let $I_n = [r_n - 4^{-n}, r_n + 4^{-n}] \cap I$. Let $f(x) = 1$ if $x \in I_n$ for some $n$, and let $f(x) = 0$
otherwise.

a) Let $f_n(x) = 1$ if $x \in I_n$, $f_n(x) = 0$ if $x \notin I_n$, and let $s_n = \max(f_1, \dots, f_n)$. Show
that $\{s_n\}$ is an increasing sequence of step functions which generates $f$. This
shows that $f \in U(I)$.

b) Prove that $\int_I f \le 2/3$.

c) If a step function $s$ satisfies $s(x) \le -f(x)$ on $I$, show that $s(x) \le -1$ almost
everywhere on $I$ and hence $\int_I s \le -1$.

d) Assume that $-f \in U(I)$ and use (b) and (c) to obtain a contradiction.

note. In the following exercises, the integrand is to be assigned the value 0 at points
where it is undefined.


Convergence theorems

10.5 If $f_n(x) = e^{-nx} - 2e^{-2nx}$, show that

$$\sum_{n=1}^{\infty} \int_0^{\infty} f_n(x)\,dx \ne \int_0^{\infty} \sum_{n=1}^{\infty} f_n(x)\,dx.$$

10.6 Justify the following equations:

a) $$\int_0^1 \log \frac{1}{1 - x}\,dx = \int_0^1 \sum_{n=1}^{\infty} \frac{x^n}{n}\,dx = \sum_{n=1}^{\infty} \frac{1}{n} \int_0^1 x^n\,dx = 1.$$

b) $$\int_0^1 \frac{x^{p-1}}{1 - x} \log\left(\frac{1}{x}\right) dx = \sum_{n=0}^{\infty} \frac{1}{(n + p)^2} \quad (p > 0).$$


10.7 Prove Tannery's convergence theorem for Riemann integrals: Given a sequence of
functions $\{f_n\}$ and an increasing sequence $\{p_n\}$ of real numbers such that $p_n \to +\infty$ as
$n \to \infty$. Assume that

a) $f_n \to f$ uniformly on $[a, b]$ for every $b \ge a$.

b) $f_n$ is Riemann-integrable on $[a, b]$ for every $b \ge a$.

c) $|f_n(x)| \le g(x)$ almost everywhere on $[a, +\infty)$, where $g$ is nonnegative and
improper Riemann-integrable on $[a, +\infty)$.

Then both $f$ and $|f|$ are improper Riemann-integrable on $[a, +\infty)$, the sequence $\{\int_a^{p_n} f_n\}$
converges, and

$$\int_a^{+\infty} f(x)\,dx = \lim_{n\to\infty} \int_a^{p_n} f_n(x)\,dx.$$

d) Use Tannery's theorem to prove that

$$\lim_{n\to\infty} \int_0^n \left( 1 - \frac{x}{n} \right)^n x^p\,dx = \int_0^{\infty} e^{-x} x^p\,dx, \quad\text{if } p > -1.$$


10.8 Prove Fatou's lemma: Given a sequence $\{f_n\}$ of nonnegative functions in $L(I)$ such
that (a) $\{f_n\}$ converges almost everywhere on $I$ to a limit function $f$, and (b) $\int_I f_n \le A$ for
some $A > 0$ and all $n \ge 1$. Then the limit function $f \in L(I)$ and $\int_I f \le A$.

note. It is not asserted that $\{\int_I f_n\}$ converges. (Compare with Theorem 10.24.)

Hint. Let $g_n(x) = \inf \{f_n(x), f_{n+1}(x), \dots\}$. Then $g_n \nearrow f$ a.e. on $I$ and
$\int_I g_n \le \int_I f_n \le A$, so $\lim_{n\to\infty} \int_I g_n$ exists and is $\le A$. Now apply Theorem 10.24.


Improper Riemann Integrals 

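10.9 a) If $p > 1$, prove that the integral $\int_1^{+\infty} x^{-p} \sin x\,dx$ exists both as an improper
Riemann integral and as a Lebesgue integral. Hint. Integration by parts.

b) If $0 < p \le 1$, prove that the integral in (a) exists as an improper Riemann
integral but not as a Lebesgue integral. Hint. Let

$$g(x) = \begin{cases} \dfrac{\sqrt{2}}{2x} & \text{if } n\pi + \dfrac{\pi}{4} \le x \le n\pi + \dfrac{3\pi}{4} \text{ for } n = 1, 2, \dots, \\ 0 & \text{otherwise}, \end{cases}$$

and show that

$$\int_1^{n\pi} x^{-p} |\sin x|\,dx \ge \int_{\pi}^{n\pi} g(x)\,dx \ge \frac{\sqrt{2}}{4} \sum_{k=2}^{n} \frac{1}{k}.$$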


10.10 a) Use the trigonometric identity $\sin 2x = 2 \sin x \cos x$, along with the formula
$\int_0^{\infty} \sin x / x\,dx = \pi/2$, to show that

$$\int_0^{\infty} \frac{\sin x \cos x}{x}\,dx = \frac{\pi}{4}.$$

b) Use integration by parts in (a) to derive the formula

$$\int_0^{\infty} \frac{\sin^2 x}{x^2}\,dx = \frac{\pi}{2}.$$

c) Use the identity $\sin^2 x + \cos^2 x = 1$, along with (b), to obtain

$$\int_0^{\infty} \frac{\sin^4 x}{x^2}\,dx = \frac{\pi}{4}.$$

d) Use the result of (c) to obtain

$$\int_0^{\infty} \frac{\sin^4 x}{x^4}\,dx = \frac{\pi}{3}.$$


10.11 If $a > 1$, prove that the integral $\int_a^{+\infty} x^p (\log x)^q\,dx$ exists, both as an improper
Riemann integral and as a Lebesgue integral, for all $q$ if $p < -1$, or for $q < -1$ if $p = -1$.

10.12 Prove that each of the following integrals exists, both as an improper Riemann
integral and as a Lebesgue integral.

a) $\int_1^{\infty} \sin^2 \dfrac{1}{x}\,dx$,

b) $\int_0^{\infty} x^p e^{-x^q}\,dx \quad (p > 0,\ q > 0)$.


10.13 Determine whether or not each of the following integrals exists, either as an
improper Riemann integral or as a Lebesgue integral.

a) $\int_0^{\infty} e^{-(t^2 + t^{-2})}\,dt$,   b) $\int_0^{\infty} \dfrac{\cos x}{\sqrt{x}}\,dx$,

c) $\int_1^{\infty} \dfrac{\log x}{x (x^2 - 1)^{1/2}}\,dx$,   d) $\int_0^{\infty} e^{-x} \sin \dfrac{1}{x}\,dx$,

e) $\int_0^1 \log x\, \sin \dfrac{1}{x}\,dx$,   f) $\int_0^{\infty} e^{-x} \log(\cos^2 x)\,dx$.

10.14 Determine those values of p and q for which the following Lebesgue integrals exist.


a) I x p (l — x 2 ) 9 dx 9 


c) 


e) 


f‘ 

Jo 

f°° x ”- 1 - 

Jo 1 - 


oo x p-l 


-1 


dx. 


b) 


d) 


x*e~ x dx. 


r 

Jo 

r dx, 

Jo ** 


/•OO 

Jo 1 


+ X* 


dx 9 


/•oo 

Jn 


0 I (log x) p (sin jc) 1/3 dx. 


10.15 Prove that the following improper Riemann integrals have the values indicated
($m$ and $n$ denote positive integers).

a) $\int_0^{\infty} \dfrac{\sin^{2n+1} x}{x}\,dx = \dfrac{\pi (2n)!}{2^{2n+1} (n!)^2}$,

b) $\int_1^{\infty} \dfrac{\log x}{x^{n+1}}\,dx = n^{-2}$,

c) $\int_0^{\infty} x^n (1 + x)^{-n-m-1}\,dx = \dfrac{n!\,(m - 1)!}{(m + n)!}$.

10.16 Given that $f$ is Riemann-integrable on $[0, 1]$, that $f$ is periodic with period 1, and
that $\int_0^1 f(x)\,dx = 0$. Prove that the improper Riemann integral $\int_1^{+\infty} x^{-s} f(x)\,dx$ exists
if $s > 0$. Hint. Let $g(x) = \int_1^x f(t)\,dt$ and write $\int_1^b x^{-s} f(x)\,dx = \int_1^b x^{-s}\,dg(x)$.

10.17 Assume that $f \in R$ on $[a, b]$ for every $b > a > 0$. Define $g$ by the equation
$x g(x) = \int_1^x f(t)\,dt$ if $x > 0$, assume that the limit $\lim_{x\to+\infty} g(x)$ exists, and denote this
limit by $B$. If $a$ and $b$ are fixed positive numbers, prove that

a) $$\int_a^b \frac{f(x)}{x}\,dx = g(b) - g(a) + \int_a^b \frac{g(x)}{x}\,dx.$$

b) $$\lim_{T\to+\infty} \int_{aT}^{bT} \frac{f(x)}{x}\,dx = B \log \frac{b}{a}.$$

c) $$\int_1^{\infty} \frac{f(ax) - f(bx)}{x}\,dx = B \log \frac{a}{b} + \int_a^b \frac{f(t)}{t}\,dt.$$

d) Assume that the limit $\lim_{x\to 0+} x \int_x^1 f(t) t^{-2}\,dt$ exists, denote this limit by $A$,
and prove that

$$\int_0^1 \frac{f(ax) - f(bx)}{x}\,dx = A \log \frac{b}{a} - \int_a^b \frac{f(t)}{t}\,dt.$$

e) Combine (c) and (d) to deduce

$$\int_0^{\infty} \frac{f(ax) - f(bx)}{x}\,dx = (B - A) \log \frac{a}{b}$$

and use this result to evaluate the following integrals:

$$\int_0^{\infty} \frac{\cos ax - \cos bx}{x}\,dx, \qquad \int_0^{\infty} \frac{e^{-ax} - e^{-bx}}{x}\,dx.$$


Lebesgue integrals

10.18 Prove that each of the following exists as a Lebesgue integral.

a) $\int_0^1 \dfrac{x \log x}{(1 + x)^2}\,dx$,   b) $\int_0^1 \dfrac{x^p - 1}{\log x}\,dx \quad (p > -1)$,

c) $\int_0^1 \log x \log(1 + x)\,dx$,   d) $\int_0^1 \dfrac{\log(1 - x)}{(1 - x)^{1/2}}\,dx$.


10.19 Assume that $f$ is continuous on $[0, 1]$, $f(0) = 0$, and $f'(0)$ exists. Prove that the
Lebesgue integral $\int_0^1 f(x) x^{-3/2}\,dx$ exists.

10.20 Prove that the integrals in (a) and (c) exist as Lebesgue integrals but that those in 
(b) and (d) do not. 





dx > 

dx 

1 + x* sin 2 x 9 


b) 

d) 



dx 

1 + x 2 sin 2 x 


Hint. Obtain upper and lower bounds for the integrals over suitably chosen
neighborhoods of the points $n\pi$ ($n = 1, 2, 3, \dots$).


Functions defined by integrals 

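10.21 Determine the set $S$ of those real values of $y$ for which each of the following
integrals exists as a Lebesgue integral.

a) $\int_0^{\infty} \dfrac{\cos xy}{1 + x^2}\,dx$,   b) $\int_0^{\infty} (x^2 + y^2)^{-1}\,dx$,

c) $\int_0^{\infty} \dfrac{\sin^2 xy}{x^2}\,dx$,   d) $\int_0^{\infty} e^{-x^2} \cos 2xy\,dx$.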

10.22 Let $F(y) = \int_0^{\infty} e^{-x^2} \cos 2xy\,dx$ if $y \in \mathbf{R}$. Show that $F$ satisfies the differential
equation $F'(y) + 2y F(y) = 0$ and deduce that $F(y) = \frac{1}{2} \sqrt{\pi}\, e^{-y^2}$. (Use the result
$\int_0^{\infty} e^{-x^2}\,dx = \frac{1}{2} \sqrt{\pi}$, derived in Exercise 7.19.)

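10.23 Let $F(y) = \int_0^{\infty} \sin xy / [x(x^2 + 1)]\,dx$ if $y > 0$. Show that $F$ satisfies the differential
equation $F''(y) - F(y) + \pi/2 = 0$ and deduce that $F(y) = \frac{1}{2} \pi (1 - e^{-y})$. Use this result
to deduce the following equations, valid for $y > 0$ and $a > 0$:

$$\int_0^{\infty} \frac{\sin xy}{x(x^2 + a^2)}\,dx = \frac{\pi}{2a^2} (1 - e^{-ay}), \qquad \int_0^{\infty} \frac{\cos xy}{x^2 + a^2}\,dx = \frac{\pi e^{-ay}}{2a},$$

$$\int_0^{\infty} \frac{x \sin xy}{x^2 + a^2}\,dx = \frac{\pi}{2} e^{-ay}; \quad\text{you may use}\quad \int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$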


10.24 Show that $\int_0^1 \left[ \int_0^1 f(x, y)\,dx \right] dy \ne \int_0^1 \left[ \int_0^1 f(x, y)\,dy \right] dx$ if

a) $f(x, y) = \dfrac{x - y}{(x + y)^3}$,   b) $f(x, y) = \dfrac{x^2 - y^2}{(x^2 + y^2)^2}$.


10.25 Show that the order of integration cannot be interchanged in the following integrals : 

* - y 


a) 


fLf 


(x + j>) : 


dx dy. 


b) 


j: [I 


(e~ xy - 2e~ 2xy ) dy 


> dx. 


10.26 Let $f(x, y) = \int_0^{\infty} dt / [(1 + x^2 t^2)(1 + y^2 t^2)]$ if $(x, y) \ne (0, 0)$. Show (by methods
of elementary calculus) that $f(x, y) = \frac{1}{2} \pi (x + y)^{-1}$. Evaluate the iterated integral
$\int_0^1 \left[ \int_0^1 f(x, y)\,dx \right] dy$ to derive the formula:

$$\int_0^{\infty} \frac{(\arctan x)^2}{x^2}\,dx = \pi \log 2.$$


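10.27 Let $f(y) = \int_0^{\infty} \sin x \cos xy / x\,dx$ if $y \ge 0$. Show (by methods of elementary
calculus) that $f(y) = \pi/2$ if $0 \le y < 1$ and that $f(y) = 0$ if $y > 1$. Evaluate the integral
$\int_0^a f(y)\,dy$ to derive the formula

$$\int_0^{\infty} \frac{\sin ax \sin x}{x^2}\,dx = \begin{cases} \dfrac{\pi a}{2} & \text{if } 0 \le a \le 1, \\[1ex] \dfrac{\pi}{2} & \text{if } a \ge 1. \end{cases}$$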


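10.28 a) If $s > 0$ and $a > 0$, show that the series

$$\sum_{n=1}^{\infty} \frac{1}{n} \int_a^{\infty} \frac{\sin 2n\pi x}{x^s}\,dx$$

converges and prove that

$$\lim_{a\to+\infty} \sum_{n=1}^{\infty} \frac{1}{n} \int_a^{\infty} \frac{\sin 2n\pi x}{x^s}\,dx = 0.$$

b) Let $f(x) = \sum_{n=1}^{\infty} \sin(2n\pi x)/n$. Show that

$$\int_0^{\infty} \frac{f(x)}{x^s}\,dx = (2\pi)^{s-1} \zeta(2 - s) \int_0^{\infty} \frac{\sin t}{t^s}\,dt \quad\text{if } 0 < s < 1,$$

where $\zeta$ denotes the Riemann zeta function.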


10.29 a) Derive the following formula for the $n$th derivative of the Gamma function:

$$\Gamma^{(n)}(x) = \int_0^{\infty} e^{-t} t^{x-1} (\log t)^n\,dt \quad (x > 0).$$

b) When $x = 1$, show that this can be written as follows:

$$\Gamma^{(n)}(1) = \int_0^1 \left( t^2 + (-1)^n e^{t - 1/t} \right) e^{-t} t^{-2} (\log t)^n\,dt.$$

c) Use (b) to show that $\Gamma^{(n)}(1)$ has the same sign as $(-1)^n$.

In Exercises 10.30 and 10.31, $\Gamma$ denotes the Gamma function.

10.30 Use the result $\int_0^{\infty} e^{-x^2}\,dx = \frac{1}{2} \sqrt{\pi}$ to prove that $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. Prove that
$\Gamma(n + 1) = n!$ and that $\Gamma(n + \frac{1}{2}) = (2n)!\,\sqrt{\pi} / (4^n n!)$ if $n = 0, 1, 2, \dots$

10.31 a) Show that for $x > 0$ we have the series representation

$$\Gamma(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \frac{1}{n + x} + \sum_{n=0}^{\infty} c_n x^n,$$

where $c_n = (1/n!) \int_1^{\infty} t^{-1} e^{-t} (\log t)^n\,dt$. Hint: Write $\int_0^{\infty} = \int_0^1 + \int_1^{\infty}$ and use
an appropriate power series expansion in each integral.

b) Show that the power series $\sum_{n=0}^{\infty} c_n z^n$ converges for every complex $z$ and that
the series $\sum_{n=0}^{\infty} [(-1)^n / n!] / (n + z)$ converges for every complex $z \ne 0, -1,
-2, \dots$

10.32 Assume that $f$ is of bounded variation on $[0, b]$ for every $b > 0$, and that
$\lim_{x\to+\infty} f(x)$ exists. Denote this limit by $f(\infty)$ and prove that

$$\lim_{y\to 0+} y \int_0^{\infty} e^{-xy} f(x)\,dx = f(\infty).$$

Hint. Use integration by parts.

10.33 Assume that $f$ is of bounded variation on $[0, 1]$. Prove that

$$\lim_{y\to 0+} y \int_0^1 x^{y-1} f(x)\,dx = f(0+).$$


Measurable functions 

10.34 If $f$ is Lebesgue-integrable on an open interval $I$ and if $f'(x)$ exists almost
everywhere on $I$, prove that $f'$ is measurable on $I$.

10.35 a) Let $\{s_n\}$ be a sequence of step functions such that $s_n \to f$ everywhere on $\mathbf{R}$.
Prove that, for every real $a$,

$$f^{-1}((a, +\infty)) = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} s_k^{-1}\left( \left( a + \frac{1}{n}, +\infty \right) \right).$$

b) If $f$ is measurable on $\mathbf{R}$, prove that for every open subset $A$ of $\mathbf{R}$ the set $f^{-1}(A)$
is measurable.

10.36 This exercise describes an example of a nonmeasurable set in $\mathbf{R}$. If $x$ and $y$ are real
numbers in the interval $[0, 1]$, we say that $x$ and $y$ are equivalent, written $x \sim y$, whenever
$x - y$ is rational. The relation $\sim$ is an equivalence relation, and the interval $[0, 1]$ can
be expressed as a disjoint union of subsets (called equivalence classes) in each of which
any two points are equivalent. Choose a point from each equivalence class and
let $E$ be the set of points so chosen. We assume that $E$ is measurable and obtain a
contradiction. Let $A = \{r_1, r_2, \dots\}$ denote the set of rational numbers in $[-1, 1]$ and let
$E_n = \{r_n + x : x \in E\}$.

a) Prove that each $E_n$ is measurable and that $\mu(E_n) = \mu(E)$.

b) Prove that $\{E_1, E_2, \dots\}$ is a disjoint collection of sets whose union contains
$[0, 1]$ and is contained in $[-1, 2]$.

c) Use parts (a) and (b) along with the countable additivity of Lebesgue measure
to obtain a contradiction.

10.37 Refer to Exercise 10.36 and prove that the characteristic function $\chi_E$ is not
measurable. Let $f = \chi_E - \chi_{I - E}$ where $I = [0, 1]$. Prove that $|f| \in L(I)$ but that $f \notin M(I)$.
(Compare with Corollary 1 of Theorem 10.35.)




Square-integrable functions

In Exercises 10.38 through 10.42 all functions are assumed to be in $L^2(I)$. The $L^2$-norm
$\|f\|$ is defined by the formula $\|f\| = (\int_I |f|^2)^{1/2}$.

10.38 If $\lim_{n\to\infty} \|f_n - f\| = 0$, prove that $\lim_{n\to\infty} \|f_n\| = \|f\|$.

10.39 If $\lim_{n\to\infty} \|f_n - f\| = 0$ and if $\lim_{n\to\infty} f_n(x) = g(x)$ almost everywhere on $I$, prove
that $f(x) = g(x)$ almost everywhere on $I$.

10.40 If $f_n \to f$ uniformly on a compact interval $I$, and if each $f_n$ is continuous on $I$, prove
that $\lim_{n\to\infty} \|f_n - f\| = 0$.

10.41 If $\lim_{n\to\infty} \|f_n - f\| = 0$, prove that $\lim_{n\to\infty} \int_I f_n \cdot g = \int_I f \cdot g$ for every $g$ in
$L^2(I)$.

10.42 If $\lim_{n\to\infty} \|f_n - f\| = 0$ and $\lim_{n\to\infty} \|g_n - g\| = 0$, prove that
$\lim_{n\to\infty} \int_I f_n \cdot g_n = \int_I f \cdot g$.


CHAPTER 11


FOURIER SERIES 
AND FOURIER INTEGRALS 


11.1 INTRODUCTION 

In 1807, Fourier astounded some of his contemporaries by asserting that an 
“arbitrary” function could be expressed as a linear combination of sines and co- 
sines. These linear combinations, now called Fourier series, have become an 
indispensable tool in the analysis of certain periodic phenomena (such as vibra- 
tions, and planetary and wave motion) which are studied in physics and engineering. 
Many important mathematical questions have also arisen in the study of Fourier 
series, and it is a remarkable historical fact that much of the development of 
modern mathematical analysis has been profoundly influenced by the search for 
answers to these questions. For a brief but excellent account of the history of this 
subject and its impact on the development of mathematics see Reference 11.1. 


11.2 ORTHOGONAL SYSTEMS OF FUNCTIONS 

The basic problems in the theory of Fourier series are best described in the setting 
of a more general discipline known as the theory of orthogonal functions. There- 
fore we begin by introducing some terminology concerning orthogonal functions. 

note. As in the previous chapter, we shall consider functions defined on a general
subinterval $I$ of $\mathbf{R}$. The interval may be bounded, unbounded, open, closed, or
half-open. We denote by $L^2(I)$ the set of all complex-valued functions $f$ which are
measurable on $I$ and are such that $|f|^2 \in L(I)$. The inner product $(f, g)$ of two such
functions, defined by

$$(f, g) = \int_I f(x) \overline{g(x)}\,dx,$$

always exists. The nonnegative number $\|f\| = (f, f)^{1/2}$ is the $L^2$-norm of $f$.

Definition 11.1. Let $S = \{\varphi_0, \varphi_1, \varphi_2, \dots\}$ be a collection of functions in $L^2(I)$. If

$$(\varphi_n, \varphi_m) = 0 \quad\text{whenever } m \ne n,$$

the collection $S$ is said to be an orthogonal system on $I$. If, in addition, each $\varphi_n$ has
norm 1, then $S$ is said to be orthonormal on $I$.

note. Every orthogonal system for which each $\|\varphi_n\| \ne 0$ can be converted into
an orthonormal system by dividing each $\varphi_n$ by its norm.

We shall be particularly interested in the special trigonometric system
$S = \{\varphi_0, \varphi_1, \varphi_2, \dots\}$, where

$$\varphi_0(x) = \frac{1}{\sqrt{2\pi}}, \qquad \varphi_{2n-1}(x) = \frac{\cos nx}{\sqrt{\pi}}, \qquad \varphi_{2n}(x) = \frac{\sin nx}{\sqrt{\pi}}, \tag{1}$$

for $n = 1, 2, \dots$ It is a simple matter to verify that $S$ is orthonormal on any
interval of length $2\pi$. (See Exercise 11.1.) The system in (1) consists of real-valued
functions. An orthonormal system of complex-valued functions on every interval
of length $2\pi$ is given by

$$\varphi_n(x) = \frac{e^{inx}}{\sqrt{2\pi}} = \frac{\cos nx + i \sin nx}{\sqrt{2\pi}}, \qquad n = 0, 1, 2, \dots$$
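
The orthonormality of the real trigonometric system (1) can be checked numerically before it is proved. The sketch below is an illustration only: the indexing helper `phi`, the midpoint-rule inner product, and the starting point `a = 1.0` of the interval are our own assumptions. It builds the matrix of inner products $(\varphi_j, \varphi_k)$ for the first few indices on an interval of length $2\pi$.

```python
import math

def phi(m, x):
    """m-th function of the real trigonometric system (1)."""
    if m == 0:
        return 1 / math.sqrt(2 * math.pi)
    n = (m + 1) // 2                       # m = 2n - 1 gives cos nx, m = 2n gives sin nx
    trig = math.cos if m % 2 == 1 else math.sin
    return trig(n * x) / math.sqrt(math.pi)

def inner(u, v, a=0.0, length=2 * math.pi, N=2000):
    """Midpoint-rule approximation to the integral of u*v over [a, a + length]."""
    h = length / N
    return sum(u(a + (i + 0.5) * h) * v(a + (i + 0.5) * h) for i in range(N)) * h

def gram(size, a=0.0):
    """Matrix of inner products (phi_j, phi_k) for 0 <= j, k < size."""
    return [[inner(lambda x: phi(j, x), lambda x: phi(k, x), a) for k in range(size)]
            for j in range(size)]

G = gram(5, a=1.0)     # the starting point a = 1.0 is arbitrary
for row in G:
    print([round(entry, 10) for entry in row])
```

Up to rounding, the printed matrix is the identity, whatever starting point `a` is chosen.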


11.3 THE THEOREM ON BEST APPROXIMATION 

One of the basic problems in the theory of orthogonal functions is to approximate
a given function $f$ in $L^2(I)$ as closely as possible by linear combinations of elements
of an orthonormal system. More precisely, let $S = \{\varphi_0, \varphi_1, \varphi_2, \dots\}$ be
orthonormal on $I$ and let

$$t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x),$$

where $b_0, b_1, \dots, b_n$ are arbitrary complex numbers. We use the norm $\|f - t_n\|$
as a measure of the error made in approximating $f$ by $t_n$. The first task is to choose
the constants $b_0, \dots, b_n$ so that this error will be as small as possible. The next
theorem shows that there is a unique choice of the constants that minimizes this
error.

To motivate the results in the theorem we consider the most favorable case. 
If/is already a linear combination of <p 0 , (p t , , <p n , say 

If 

/ = 2 Wk, 

k = 0 

then the choice t„ = / will make \\f - tj = 0. We can determine the constants 
c 0 , .. . ,c„ as follows. Form the inner product (/ <p m ), where 0 < m < n. Using 
the properties of inner products we have 

(/> <Pm) = ( Z] C k<Pk > <Pm ) = 2 C k(<Pk, 9m) = C m , 

\* = 0 ) * = 0 

since ( <p k , <p m ) — 0 if k ^ m and (q> m , (p m ) =1. In other words, in the most 
favorable case we have c m = (/ <p m ) for m = 0, 1, . . . , n. The next theorem shows 
that this choice of constants is best for all functions in L 2 (7). 
Theorem 11.2. Let $\{\varphi_0, \varphi_1, \varphi_2, \dots\}$ be orthonormal on $I$, and assume that
$f \in L^2(I)$. Define two sequences of functions $\{s_n\}$ and $\{t_n\}$ on $I$ as follows:

\[
s_n(x) = \sum_{k=0}^{n} c_k \varphi_k(x), \qquad t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x),
\]

where

\[
c_k = (f, \varphi_k) \qquad \text{for } k = 0, 1, 2, \dots,
\tag{2}
\]

and $b_0, b_1, b_2, \dots$ are arbitrary complex numbers. Then for each $n$ we have

\[
\|f - s_n\| \le \|f - t_n\|.
\tag{3}
\]

Moreover, equality holds in (3) if, and only if, $b_k = c_k$ for $k = 0, 1, \dots, n$.


Proof. We shall deduce (3) from the equation

\[
\|f - t_n\|^2 = \|f\|^2 - \sum_{k=0}^{n} |c_k|^2 + \sum_{k=0}^{n} |b_k - c_k|^2.
\tag{4}
\]

It is clear that (4) implies (3) because the right member of (4) has its smallest value
when $b_k = c_k$ for each $k$. To prove (4), write

\[
\|f - t_n\|^2 = (f - t_n, f - t_n) = (f, f) - (f, t_n) - (t_n, f) + (t_n, t_n).
\]

Using the properties of inner products we find

\[
(t_n, t_n) = \left( \sum_{k=0}^{n} b_k \varphi_k,\ \sum_{m=0}^{n} b_m \varphi_m \right)
= \sum_{k=0}^{n} \sum_{m=0}^{n} b_k \bar{b}_m (\varphi_k, \varphi_m)
= \sum_{k=0}^{n} |b_k|^2,
\]

and

\[
(f, t_n) = \left( f,\ \sum_{k=0}^{n} b_k \varphi_k \right)
= \sum_{k=0}^{n} \bar{b}_k (f, \varphi_k) = \sum_{k=0}^{n} \bar{b}_k c_k.
\]

Also, $(t_n, f) = \overline{(f, t_n)} = \sum_{k=0}^{n} b_k \bar{c}_k$, and hence

\[
\begin{aligned}
\|f - t_n\|^2 &= \|f\|^2 - \sum_{k=0}^{n} \bar{b}_k c_k - \sum_{k=0}^{n} b_k \bar{c}_k + \sum_{k=0}^{n} |b_k|^2 \\
&= \|f\|^2 - \sum_{k=0}^{n} |c_k|^2 + \sum_{k=0}^{n} (b_k - c_k)(\bar{b}_k - \bar{c}_k) \\
&= \|f\|^2 - \sum_{k=0}^{n} |c_k|^2 + \sum_{k=0}^{n} |b_k - c_k|^2.
\end{aligned}
\]
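Equation (4) and inequality (3) lend themselves to a numerical check. The sketch below is illustrative only: it assumes the real trigonometric system, the sample function $f(x) = x$ on $[0, 2\pi]$, a midpoint rule for the integrals, and an arbitrarily chosen perturbation of the Fourier coefficients. None of these choices, nor the helper names, come from the text.

```python
import math

def phi(k, x):
    """k-th function of the real trigonometric system on [0, 2*pi]."""
    if k == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    n = (k + 1) // 2
    trig = math.cos if k % 2 == 1 else math.sin
    return trig(n * x) / math.sqrt(math.pi)

STEPS = 4000
H = 2.0 * math.pi / STEPS
XS = [(i + 0.5) * H for i in range(STEPS)]

def integrate(g):
    """Midpoint rule on [0, 2*pi]."""
    return H * sum(g(x) for x in XS)

def f(x):
    return x                                  # sample function (an assumption)

n = 4
c = [integrate(lambda x, k=k: f(x) * phi(k, x)) for k in range(n + 1)]
shift = [0.3, -0.2, 0.1, 0.0, 0.5]            # arbitrary perturbation
b = [ck + d for ck, d in zip(c, shift)]

def err2(coeffs):
    """Squared norm of f minus the combination with the given coefficients."""
    return integrate(lambda x: (f(x) - sum(a * phi(k, x)
                                           for k, a in enumerate(coeffs))) ** 2)

norm2 = integrate(lambda x: f(x) ** 2)
lhs = err2(b)                                              # left member of (4)
rhs = norm2 - sum(ck ** 2 for ck in c) + sum(d ** 2 for d in shift)
```

Here `lhs` and `rhs` should agree up to rounding, and `err2(c)` should not exceed `err2(b)`, in agreement with (3).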




11.4 THE FOURIER SERIES OF A FUNCTION RELATIVE TO AN 
ORTHONORMAL SYSTEM 

Definition 11.3. Let $S = \{\varphi_0, \varphi_1, \varphi_2, \dots\}$ be orthonormal on $I$ and assume that
$f \in L^2(I)$. The notation

\[
f(x) \sim \sum_{n=0}^{\infty} c_n \varphi_n(x)
\tag{5}
\]

will mean that the numbers $c_0, c_1, c_2, \dots$ are given by the formulas:

\[
c_n = (f, \varphi_n) = \int_I f(x) \overline{\varphi_n(x)}\, dx \qquad (n = 0, 1, 2, \dots).
\tag{6}
\]

The series in (5) is called the Fourier series of $f$ relative to $S$, and the numbers
$c_0, c_1, c_2, \dots$ are called the Fourier coefficients of $f$ relative to $S$.

NOTE. When $I = [0, 2\pi]$ and $S$ is the system of trigonometric functions described
in (1), the series is called simply the Fourier series generated by $f$. We then write (5)
in the form

\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),
\]

the coefficients being given by the following formulas:

\[
a_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \cos nt\, dt, \qquad
b_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \sin nt\, dt.
\tag{7}
\]

In this case the integrals for $a_n$ and $b_n$ exist if $f \in L([0, 2\pi])$.
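As an illustration of the formulas in (7), the following sketch approximates $a_n$ and $b_n$ for the sample function $f(t) = t$ on $[0, 2\pi]$. The function name `fourier_coefficients`, the midpoint rule, and the step count are assumptions made for this example. For this particular $f$, integration by parts gives $a_0 = 2\pi$, $a_n = 0$ and $b_n = -2/n$ for $n \ge 1$, which the numerical values should approach.

```python
import math

def fourier_coefficients(f, n, steps=20000):
    """Midpoint-rule approximation of (a_n, b_n) from formula (7)."""
    h = 2.0 * math.pi / steps
    a = b = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        a += f(t) * math.cos(n * t)
        b += f(t) * math.sin(n * t)
    return a * h / math.pi, b * h / math.pi

def identity(t):
    return t                      # sample function f(t) = t (an assumption)

coefficients = [fourier_coefficients(identity, n) for n in range(4)]
for n, (a_n, b_n) in enumerate(coefficients):
    print(n, a_n, b_n)
```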


11.5 PROPERTIES OF THE FOURIER COEFFICIENTS 

Theorem 11.4. Let $\{\varphi_0, \varphi_1, \varphi_2, \dots\}$ be orthonormal on $I$, assume that $f \in L^2(I)$,
and suppose that

\[
f(x) \sim \sum_{n=0}^{\infty} c_n \varphi_n(x).
\]

Then

a) The series $\sum |c_n|^2$ converges and satisfies the inequality

\[
\sum_{n=0}^{\infty} |c_n|^2 \le \|f\|^2 \qquad \text{(Bessel's inequality).}
\tag{8}
\]

b) The equation

\[
\sum_{n=0}^{\infty} |c_n|^2 = \|f\|^2 \qquad \text{(Parseval's formula)}
\]

holds if, and only if, we also have

\[
\lim_{n \to \infty} \|f - s_n\| = 0,
\]

where $\{s_n\}$ is the sequence of partial sums defined by

\[
s_n(x) = \sum_{k=0}^{n} c_k \varphi_k(x).
\]

Proof. We take $b_k = c_k$ in (4) and observe that the left member is nonnegative.
Therefore

\[
\sum_{k=0}^{n} |c_k|^2 \le \|f\|^2.
\]

This establishes (a). To prove (b), we again put $b_k = c_k$ in (4) to obtain

\[
\|f - s_n\|^2 = \|f\|^2 - \sum_{k=0}^{n} |c_k|^2.
\]

Part (b) follows at once from this equation.

As a further consequence of part (a) of Theorem 11.4 we observe that the
Fourier coefficients $c_n$ tend to 0 as $n \to \infty$ (since $\sum |c_n|^2$ converges). In particular,
when $\varphi_n(x) = e^{inx}/\sqrt{2\pi}$ and $I = [0, 2\pi]$ we find

\[
\lim_{n \to \infty} \int_0^{2\pi} f(x) e^{-inx}\, dx = 0,
\]

from which we obtain the important formulas

\[
\lim_{n \to \infty} \int_0^{2\pi} f(x) \cos nx\, dx
= \lim_{n \to \infty} \int_0^{2\pi} f(x) \sin nx\, dx = 0.
\tag{9}
\]

These formulas are also special cases of the Riemann-Lebesgue lemma (Theorem
11.6).

NOTE. The Parseval formula

\[
\|f\|^2 = |c_0|^2 + |c_1|^2 + |c_2|^2 + \cdots
\]

is analogous to the formula

\[
\|x\|^2 = x_1^2 + x_2^2 + \cdots + x_n^2
\]

for the length of a vector $x = (x_1, \dots, x_n)$ in $R^n$. Each of these can be regarded
as a generalization of the Pythagorean theorem for right triangles.
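Bessel's inequality can be watched at work on a concrete function. The sketch below assumes $f(x) = x$ on $[0, 2\pi]$ and the real trigonometric system; the closed-form values $|c_0|^2 = 2\pi^3$, $c_{2n-1} = 0$, $|c_{2n}|^2 = 4\pi/n^2$ and $\|f\|^2 = 8\pi^3/3$ were obtained by hand for this example and are not taken from the text.

```python
import math

# Squared norm of f(x) = x on [0, 2*pi]: the integral of x^2.
norm_sq = 8.0 * math.pi ** 3 / 3.0

def bessel_partial_sum(N):
    """Sum of |c_k|^2 for k = 0, ..., 2N, using hand-computed coefficients."""
    total = 2.0 * math.pi ** 3                 # |c_0|^2
    for n in range(1, N + 1):
        total += 4.0 * math.pi / n ** 2        # |c_{2n}|^2; c_{2n-1} = 0
    return total

for N in (1, 10, 100, 1000):
    print(N, bessel_partial_sum(N), norm_sq - bessel_partial_sum(N))
```

The printed gaps stay positive, as (8) requires, and shrink toward 0, which is what Parseval's formula predicts for this $f$.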
11.6 THE RIESZ-FISCHER THEOREM

The converse to part (a) of Theorem 11.4 is called the Riesz-Fischer theorem.

Theorem 11.5. Assume $\{\varphi_0, \varphi_1, \dots\}$ is orthonormal on $I$. Let $\{c_n\}$ be any sequence
of complex numbers such that $\sum |c_k|^2$ converges. Then there is a function $f$ in $L^2(I)$
such that

a) $(f, \varphi_k) = c_k$ for each $k \ge 0$,

and

b) $\|f\|^2 = \sum_{k=0}^{\infty} |c_k|^2$.

Proof. Let

\[
s_n(x) = \sum_{k=0}^{n} c_k \varphi_k(x).
\]

We will prove that there is a function $f$ in $L^2(I)$ such that $(f, \varphi_k) = c_k$ and such that

\[
\lim_{n \to \infty} \|s_n - f\| = 0.
\]

Part (b) of Theorem 11.4 then implies part (b) of Theorem 11.5.

First we note that $\{s_n\}$ is a Cauchy sequence in the semimetric space $L^2(I)$
because, if $m > n$ we have

\[
\|s_n - s_m\|^2 = \sum_{k=n+1}^{m} \sum_{r=n+1}^{m} c_k \bar{c}_r (\varphi_k, \varphi_r)
= \sum_{k=n+1}^{m} |c_k|^2,
\]

and the last sum can be made less than $\varepsilon$ if $m$ and $n$ are sufficiently large. By
Theorem 10.57 there is a function $f$ in $L^2(I)$ such that

\[
\lim_{n \to \infty} \|s_n - f\| = 0.
\]

To show that $(f, \varphi_k) = c_k$ we note that $(s_n, \varphi_k) = c_k$ if $n \ge k$, and use the
Cauchy-Schwarz inequality to obtain

\[
|c_k - (f, \varphi_k)| = |(s_n, \varphi_k) - (f, \varphi_k)| = |(s_n - f, \varphi_k)| \le \|s_n - f\|.
\]

Since $\|s_n - f\| \to 0$ as $n \to \infty$ this proves (a).

NOTE. The proof of this theorem depends on the fact that the semimetric space
$L^2(I)$ is complete. There is no corresponding theorem for functions whose squares
are Riemann-integrable.
11.7 THE CONVERGENCE AND REPRESENTATION PROBLEMS FOR
TRIGONOMETRIC SERIES

Consider the trigonometric Fourier series generated by a function $f$ which is
Lebesgue-integrable on the interval $I = [0, 2\pi]$, say

\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).
\]

Two questions arise. Does the series converge at some point $x$ in $I$? If it does
converge at $x$, is its sum $f(x)$? The first question is called the convergence problem;
the second, the representation problem. In general, the answer to both questions
is "No." In fact, there exist Lebesgue-integrable functions whose Fourier series
diverge everywhere, and there exist continuous functions whose Fourier series
diverge on an uncountable set.

Ever since Fourier’s time, an enormous literature has been published on these 
problems. The object of much of the research has been to find sufficient conditions 
to be satisfied by $f$ in order that its Fourier series may converge, either throughout
the interval or at particular points. We shall prove later that the convergence or 
divergence of the series at a particular point depends only on the behavior of the 
function in arbitrarily small neighborhoods of the point. (See Theorem 11.11, 
Riemann’s localization theorem.) 

The efforts of Fourier and Dirichlet in the early nineteenth century, followed 
by the contributions of Riemann, Lipschitz, Heine, Cantor, Du Bois-Reymond, 
Dini, Jordan, and de la Vallée-Poussin in the latter part of the century, led to the
discovery of sufficient conditions of a wide scope for establishing convergence of 
the series, either at particular points, or generally, throughout the interval. 

After the discovery by Lebesgue, in 1902, of his general theory of measure and
integration, the field of investigation was considerably widened and the names
chiefly associated with the subject since then are those of Fejér, Hobson, W. H.
Young, Hardy, and Littlewood. Fejér showed, in 1903, that divergent Fourier
series may be utilized by considering, instead of the sequence of partial sums $\{s_n\}$,
the sequence of arithmetic means $\{\sigma_n\}$, where

\[
\sigma_n(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{n-1}(x)}{n}.
\]

He established the remarkable theorem that the sequence $\{\sigma_n(x)\}$ is convergent
and its limit is $\frac{1}{2}[f(x+) + f(x-)]$ at every point in $[0, 2\pi]$ where $f(x+)$ and
$f(x-)$ exist, the only restriction on $f$ being that it be Lebesgue-integrable on
$[0, 2\pi]$ (Theorem 11.15). Fejér also proved that every Fourier series, whether it
converges or not, can be integrated term-by-term (Theorem 11.16). The most
striking result on Fourier series proved in recent times is that of Lennart Carleson,
a Swedish mathematician, who proved that the Fourier series of a function in
$L^2(I)$ converges almost everywhere on $I$. (Acta Mathematica, 116 (1966), pp.
135-157.)
In this chapter we shall deduce some of the sufficient conditions for convergence
of a Fourier series at a particular point. Then we shall prove Fejér's theorems.
The discussion rests on two fundamental limit formulas which will be discussed
first. These limit formulas, which are also used in the theory of Fourier integrals,
deal with integrals depending on a real parameter $\alpha$, and we are interested in the
behavior of these integrals as $\alpha \to +\infty$. The first of these is a generalization of (9)
and is known as the Riemann-Lebesgue lemma.


11.8 THE RIEMANN-LEBESGUE LEMMA 


Theorem 11.6. Assume that $f \in L(I)$. Then, for each real $\beta$, we have

\[
\lim_{\alpha \to +\infty} \int_I f(t) \sin(\alpha t + \beta)\, dt = 0.
\tag{10}
\]



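Proof. If $f$ is the characteristic function of a compact interval $[a, b]$ the result is
obvious since we have

\[
\left| \int_a^b \sin(\alpha t + \beta)\, dt \right|
= \left| \frac{\cos(a\alpha + \beta) - \cos(b\alpha + \beta)}{\alpha} \right|
\le \frac{2}{\alpha}, \qquad \text{if } \alpha > 0.
\]

The result also holds if $f$ is constant on the open interval $(a, b)$ and zero outside
$[a, b]$, regardless of how we define $f(a)$ and $f(b)$. Therefore (10) is valid if $f$ is a
step function. But now it is easy to prove (10) for every Lebesgue-integrable
function $f$.

If $\varepsilon > 0$ is given, there exists a step function $s$ such that $\int_I |f - s| < \varepsilon/2$ (by
Theorem 10.19(b)). Since (10) holds for step functions, there is a positive $M$ such
that

\[
\left| \int_I s(t) \sin(\alpha t + \beta)\, dt \right| < \frac{\varepsilon}{2} \qquad \text{if } \alpha > M.
\]

Therefore, if $\alpha > M$ we have

\[
\begin{aligned}
\left| \int_I f(t) \sin(\alpha t + \beta)\, dt \right|
&\le \left| \int_I (f(t) - s(t)) \sin(\alpha t + \beta)\, dt \right|
 + \left| \int_I s(t) \sin(\alpha t + \beta)\, dt \right| \\
&\le \int_I |f(t) - s(t)|\, dt + \frac{\varepsilon}{2}
 < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]

This completes the proof of the Riemann-Lebesgue lemma.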
Example. Taking $\beta = 0$ and $\beta = \pi/2$, we find, if $f \in L(I)$,

\[
\lim_{\alpha \to +\infty} \int_I f(t) \sin \alpha t\, dt
= \lim_{\alpha \to +\infty} \int_I f(t) \cos \alpha t\, dt = 0.
\]
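The decay asserted by the Riemann-Lebesgue lemma is easy to observe numerically. The following sketch is illustrative only: the integrand $f(t) = \sqrt{t}$ on $I = [0, 1]$, the midpoint rule, the step count, and the name `oscillatory_integral` are assumptions chosen for this example.

```python
import math

def oscillatory_integral(f, alpha, beta=0.0, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral in (10) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += f(t) * math.sin(alpha * t + beta)
    return h * total

values = {alpha: oscillatory_integral(math.sqrt, alpha)
          for alpha in (1, 10, 100, 400)}
for alpha, value in values.items():
    print(alpha, value)
```

The step count must be large compared with $\alpha$ for the quadrature to resolve the oscillation; with that caveat, the printed values should shrink in magnitude as $\alpha$ grows.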
As an application of the Riemann-Lebesgue lemma we derive a result that will
be needed in our discussion of Fourier integrals.

Theorem 11.7. If $f \in L(-\infty, +\infty)$, we have

\[
\lim_{\alpha \to +\infty} \int_{-\infty}^{\infty} f(t)\, \frac{1 - \cos \alpha t}{t}\, dt
= \int_0^{\infty} \frac{f(t) - f(-t)}{t}\, dt,
\tag{11}
\]

whenever the Lebesgue integral on the right exists.


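Proof. For each fixed $\alpha$, the integral on the left of (11) exists as a Lebesgue
integral since the quotient $(1 - \cos \alpha t)/t$ is continuous and bounded on
$(-\infty, +\infty)$. (At $t = 0$ the quotient is to be replaced by 0, its limit as $t \to 0$.)
Hence we can write

\[
\begin{aligned}
\int_{-\infty}^{\infty} f(t)\, \frac{1 - \cos \alpha t}{t}\, dt
&= \int_0^{\infty} f(t)\, \frac{1 - \cos \alpha t}{t}\, dt
 + \int_{-\infty}^{0} f(t)\, \frac{1 - \cos \alpha t}{t}\, dt \\
&= \int_0^{\infty} [f(t) - f(-t)]\, \frac{1 - \cos \alpha t}{t}\, dt \\
&= \int_0^{\infty} \frac{f(t) - f(-t)}{t}\, dt
 - \int_0^{\infty} \frac{f(t) - f(-t)}{t} \cos \alpha t\, dt.
\end{aligned}
\]

When $\alpha \to +\infty$, the last integral tends to 0, by the Riemann-Lebesgue lemma.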


11.9 THE DIRICHLET INTEGRALS 


Integrals of the form $\int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt$ (called Dirichlet integrals) play an
important role in the theory of Fourier series and also in the theory of Fourier
integrals. The function $g$ in the integrand is assumed to have a finite right-hand
limit $g(0+) = \lim_{t \to 0+} g(t)$ and we are interested in formulating further
conditions on $g$ which will guarantee the validity of the following equation:

\[
\lim_{\alpha \to +\infty} \frac{2}{\pi} \int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt = g(0+).
\tag{12}
\]

To get an idea why we might expect a formula like (12) to hold, let us first consider
the case when $g$ is constant ($g(t) = g(0+)$) on $[0, \delta]$. Then (12) is a trivial
consequence of the equation $\int_0^{\infty} (\sin t)/t\, dt = \pi/2$ (see Example 3, Section 10.16),
since

\[
\int_0^{\delta} \frac{\sin \alpha t}{t}\, dt = \int_0^{\alpha\delta} \frac{\sin t}{t}\, dt \to \frac{\pi}{2}
\qquad \text{as } \alpha \to +\infty.
\]

More generally, if $g \in L([0, \delta])$, and if $0 < \varepsilon < \delta$, we have

\[
\lim_{\alpha \to +\infty} \frac{2}{\pi} \int_{\varepsilon}^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt = 0,
\]

by the Riemann-Lebesgue lemma. Hence the validity of (12) is governed entirely
by the local behavior of $g$ near 0. Since $g(t)$ is nearly $g(0+)$ when $t$ is near 0, there
is some hope of proving (12) without placing too many additional restrictions on $g$.
It would seem that continuity of $g$ at 0 should certainly be enough to insure the
existence of the limit in (12). Dirichlet showed that continuity of $g$ on $[0, \delta]$ is
sufficient to prove (12), if, in addition, $g$ has only a finite number of maxima or
minima on $[0, \delta]$. Jordan later proved (12) under the less restrictive condition
that $g$ be of bounded variation on $[0, \delta]$. However, all attempts to prove (12) under
the sole hypothesis that $g$ is continuous on $[0, \delta]$ have resulted in failure. In fact,
Du Bois-Reymond discovered an example of a continuous function $g$ for which the
limit in (12) fails to exist. Jordan's result, and a related theorem due to Dini, will
be discussed here.
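The limit relation (12) can be explored numerically for a well-behaved $g$. The sketch below assumes $g(t) = \cos t$ on $[0, 1]$, which is of bounded variation with $g(0+) = 1$, and uses a midpoint rule; the function name `dirichlet_integral`, the value $\delta = 1$, and the step count are illustrative choices, not part of the text.

```python
import math

def dirichlet_integral(g, alpha, delta=1.0, steps=20000):
    """Approximate (2/pi) times the integral of g(t) sin(alpha t)/t on [0, delta]."""
    h = delta / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h          # midpoints never hit t = 0
        total += g(t) * math.sin(alpha * t) / t
    return (2.0 / math.pi) * h * total

for alpha in (5.0, 50.0, 200.0, 400.0):
    print(alpha, dirichlet_integral(math.cos, alpha))
```

For large $\alpha$ the printed values should settle near $g(0+) = 1$.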

Theorem 11.8 (Jordan). If $g$ is of bounded variation on $[0, \delta]$, then

\[
\lim_{\alpha \to +\infty} \frac{2}{\pi} \int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt = g(0+).
\tag{13}
\]

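Proof. It suffices to consider the case in which $g$ is increasing on $[0, \delta]$. If $\alpha > 0$
and if $0 < h < \delta$, we have

\[
\begin{aligned}
\int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt
&= \int_0^{h} [g(t) - g(0+)]\, \frac{\sin \alpha t}{t}\, dt
 + g(0+) \int_0^{h} \frac{\sin \alpha t}{t}\, dt
 + \int_h^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt \\
&= I_1(\alpha, h) + I_2(\alpha, h) + I_3(\alpha, h),
\end{aligned}
\tag{14}
\]

let us say. We can apply the Riemann-Lebesgue lemma to $I_3(\alpha, h)$ (since the
integral $\int_h^{\delta} g(t)/t\, dt$ exists) and we find $I_3(\alpha, h) \to 0$ as $\alpha \to +\infty$. Also,

\[
I_2(\alpha, h) = g(0+) \int_0^{h} \frac{\sin \alpha t}{t}\, dt
= g(0+) \int_0^{h\alpha} \frac{\sin t}{t}\, dt \to \frac{\pi}{2}\, g(0+)
\qquad \text{as } \alpha \to +\infty.
\]

Next, choose $M > 0$ so that $\left| \int_a^b (\sin t)/t\, dt \right| < M$ for every $b \ge a \ge 0$. It follows
that $\left| \int_a^b (\sin \alpha t)/t\, dt \right| < M$ for every $b \ge a \ge 0$ if $\alpha > 0$. Now let $\varepsilon > 0$ be
given and choose $h$ in $(0, \delta)$ so that $|g(h) - g(0+)| < \varepsilon/(3M)$. Since

\[
g(t) - g(0+) \ge 0 \qquad \text{if } 0 < t \le h,
\]

we can apply Bonnet's theorem (Theorem 7.37) in $I_1(\alpha, h)$ to write

\[
I_1(\alpha, h) = \int_0^{h} [g(t) - g(0+)]\, \frac{\sin \alpha t}{t}\, dt
= [g(h) - g(0+)] \int_c^{h} \frac{\sin \alpha t}{t}\, dt,
\]

where $c \in [0, h]$. The definition of $h$ gives us

\[
|I_1(\alpha, h)| = |g(h) - g(0+)| \left| \int_c^{h} \frac{\sin \alpha t}{t}\, dt \right|
< \frac{\varepsilon}{3M}\, M = \frac{\varepsilon}{3}.
\tag{15}
\]

For the same $h$ we can choose $A$ so that $\alpha \ge A$ implies

\[
|I_3(\alpha, h)| < \frac{\varepsilon}{3} \qquad \text{and} \qquad
\left| I_2(\alpha, h) - \frac{\pi}{2}\, g(0+) \right| < \frac{\varepsilon}{3}.
\tag{16}
\]

Then, for $\alpha \ge A$, we can combine (14), (15), and (16) to get

\[
\left| \int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt - \frac{\pi}{2}\, g(0+) \right| < \varepsilon.
\]

This proves (13).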

A different kind of condition for the validity of (13) was found by Dini and
can be described as follows:

Theorem 11.9 (Dini). Assume that $g(0+)$ exists and suppose that for some $\delta > 0$
the Lebesgue integral

\[
\int_0^{\delta} \frac{g(t) - g(0+)}{t}\, dt
\]

exists. Then we have

\[
\lim_{\alpha \to +\infty} \frac{2}{\pi} \int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt = g(0+).
\]

Proof. Write

\[
\int_0^{\delta} g(t)\, \frac{\sin \alpha t}{t}\, dt
= \int_0^{\delta} \frac{g(t) - g(0+)}{t} \sin \alpha t\, dt
 + g(0+) \int_0^{\alpha\delta} \frac{\sin t}{t}\, dt.
\]

When $\alpha \to +\infty$, the first term on the right tends to 0 (by the Riemann-Lebesgue
lemma) and the second term tends to $\frac{1}{2}\pi g(0+)$.

NOTE. If $g \in L([a, \delta])$ for every positive $a < \delta$, it is easy to show that Dini's
condition is satisfied whenever $g$ satisfies a "right-handed" Lipschitz condition at
0; that is, whenever there exist two positive constants $M$ and $p$ such that

\[
|g(t) - g(0+)| < M t^p, \qquad \text{for every } t \text{ in } (0, \delta].
\]

(See Exercise 11.21.) In particular, the Lipschitz condition holds with $p = 1$
whenever $g$ has a righthand derivative at 0. It is of interest to note that there exist
functions which satisfy Dini's condition but which do not satisfy Jordan's
condition. Similarly, there are functions which satisfy Jordan's condition but not
Dini's. (See Reference 11.10.)
11.10 AN INTEGRAL REPRESENTATION FOR THE PARTIAL SUMS OF A
FOURIER SERIES

A function $f$ is said to be periodic with period $p \ne 0$ if $f$ is defined on $R$ and if
$f(x + p) = f(x)$ for all $x$. The next theorem expresses the partial sums of a
Fourier series in terms of the function

\[
D_n(t) = \frac{1}{2} + \sum_{k=1}^{n} \cos kt =
\begin{cases}
\dfrac{\sin (n + \frac{1}{2})t}{2 \sin (t/2)} & \text{if } t \ne 2m\pi \ (m \text{ an integer}), \\[2ex]
n + \frac{1}{2} & \text{if } t = 2m\pi \ (m \text{ an integer}).
\end{cases}
\tag{17}
\]

This formula was discussed in Section 8.16 in connection with the partial sums of
the geometric series. The function $D_n$ is called Dirichlet's kernel.
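The two expressions for $D_n(t)$ in (17) can be compared directly. In the following sketch the names `dirichlet_sum` and `dirichlet_closed`, and the tolerance used to detect $t = 2m\pi$, are assumptions made for illustration.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) computed from the defining sum 1/2 + cos t + ... + cos nt."""
    return 0.5 + sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """D_n(t) computed from the closed form in (17)."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:             # t is (numerically) a multiple of 2*pi
        return n + 0.5
    return math.sin((n + 0.5) * t) / (2.0 * s)

for t in (0.1, 1.0, 3.0, -2.5):
    print(t, dirichlet_sum(5, t), dirichlet_closed(5, t))
```

The two columns should agree up to rounding error.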


Theorem 11.10. Assume that $f \in L([0, 2\pi])$ and suppose that $f$ is periodic with
period $2\pi$. Let $\{s_n\}$ denote the sequence of partial sums of the Fourier series generated
by $f$, say

\[
s_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx), \qquad (n = 1, 2, \dots).
\tag{18}
\]

Then we have the integral representation

\[
s_n(x) = \frac{2}{\pi} \int_0^{\pi} \frac{f(x + t) + f(x - t)}{2}\, D_n(t)\, dt.
\tag{19}
\]



Proof. The Fourier coefficients of $f$ are given by the integrals in (7). Substituting
these integrals in (18) we find

\[
\begin{aligned}
s_n(x) &= \frac{1}{\pi} \int_0^{2\pi} f(t) \left\{ \frac{1}{2} + \sum_{k=1}^{n} (\cos kt \cos kx + \sin kt \sin kx) \right\} dt \\
&= \frac{1}{\pi} \int_0^{2\pi} f(t) \left\{ \frac{1}{2} + \sum_{k=1}^{n} \cos k(t - x) \right\} dt
 = \frac{1}{\pi} \int_0^{2\pi} f(t) D_n(t - x)\, dt.
\end{aligned}
\]

Since both $f$ and $D_n$ are periodic with period $2\pi$, we can replace the interval of
integration by $[x - \pi, x + \pi]$ and then make a translation $u = t - x$ to get

\[
s_n(x) = \frac{1}{\pi} \int_{x-\pi}^{x+\pi} f(t) D_n(t - x)\, dt
= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + u) D_n(u)\, du.
\]

Using the equation $D_n(-u) = D_n(u)$, we obtain (19).
11.11 RIEMANN'S LOCALIZATION THEOREM

Formula (19) tells us that the Fourier series generated by $f$ will converge at a point
$x$ if, and only if, the following limit exists:

\[
\lim_{n \to \infty} \frac{2}{\pi} \int_0^{\pi} \frac{f(x + t) + f(x - t)}{2}\,
\frac{\sin (n + \frac{1}{2})t}{2 \sin \frac{1}{2}t}\, dt,
\tag{20}
\]

in which case the value of this limit will be the sum of the series. This integral is
essentially a Dirichlet integral of the type discussed in the previous section, except
that $2 \sin \frac{1}{2}t$ appears in the denominator rather than $t$. However, the
Riemann-Lebesgue lemma allows us to replace $2 \sin \frac{1}{2}t$ by $t$ in (20) without affecting either
the existence or the value of the limit. More precisely, the Riemann-Lebesgue
lemma implies

\[
\lim_{n \to \infty} \frac{2}{\pi} \int_0^{\pi}
\left( \frac{1}{t} - \frac{1}{2 \sin \frac{1}{2}t} \right)
\frac{f(x + t) + f(x - t)}{2}\, \sin (n + \tfrac{1}{2})t\, dt = 0,
\]

because the function $F$ defined by the equation

\[
F(t) =
\begin{cases}
\dfrac{1}{t} - \dfrac{1}{2 \sin \frac{1}{2}t} & \text{if } 0 < t \le \pi, \\[2ex]
0 & \text{if } t = 0,
\end{cases}
\]

is continuous on $[0, \pi]$. Therefore the convergence problem for Fourier series
amounts to finding conditions on $f$ which will guarantee the existence of the
following limit:

\[
\lim_{n \to \infty} \frac{2}{\pi} \int_0^{\pi} \frac{f(x + t) + f(x - t)}{2}\,
\frac{\sin (n + \frac{1}{2})t}{t}\, dt.
\tag{21}
\]

Using the Riemann-Lebesgue lemma once more, we need only consider the limit
in (21) when the integral $\int_0^{\pi}$ is replaced by $\int_0^{\delta}$, where $\delta$ is any positive number $< \pi$,
because the integral $\int_{\delta}^{\pi}$ tends to 0 as $n \to \infty$. Therefore we can sum up the results
of the previous section in the following theorem:

Theorem 11.11. Assume that $f \in L([0, 2\pi])$ and suppose $f$ has period $2\pi$. Then
the Fourier series generated by $f$ will converge for a given value of $x$ if, and only if,
for some positive $\delta < \pi$ the following limit exists:

\[
\lim_{n \to \infty} \frac{2}{\pi} \int_0^{\delta} \frac{f(x + t) + f(x - t)}{2}\,
\frac{\sin (n + \frac{1}{2})t}{t}\, dt,
\tag{22}
\]

in which case the value of this limit is the sum of the Fourier series.


This theorem is known as Riemann's localization theorem. It tells us that the
convergence or divergence of a Fourier series at a particular point is governed
entirely by the behavior of $f$ in an arbitrarily small neighborhood of the point.
This is rather surprising in view of the fact that the coefficients of the Fourier
series depend on the values which the function assumes throughout the entire
interval $[0, 2\pi]$.


11.12 SUFFICIENT CONDITIONS FOR CONVERGENCE OF A FOURIER 
SERIES AT A PARTICULAR POINT 

Assume that $f \in L([0, 2\pi])$ and suppose that $f$ has period $2\pi$. Consider a fixed $x$
in $[0, 2\pi]$ and a positive $\delta < \pi$. Let

\[
g(t) = \frac{f(x + t) + f(x - t)}{2} \qquad \text{if } t \in [0, \delta],
\]

and let

\[
s(x) = g(0+) = \lim_{t \to 0+} \frac{f(x + t) + f(x - t)}{2}
\]

whenever this limit exists. Note that $s(x) = f(x)$ if $f$ is continuous at $x$.

By combining Theorem 11.11 with Theorems 11.8 and 11.9, respectively, we
obtain the following sufficient conditions for convergence of a Fourier series.

Theorem 11.12 (Jordan's test). If $f$ is of bounded variation on the compact interval
$[x - \delta, x + \delta]$ for some $\delta < \pi$, then the limit $s(x)$ exists and the Fourier series
generated by $f$ converges to $s(x)$.

Theorem 11.13 (Dini's test). If the limit $s(x)$ exists and if the Lebesgue integral

\[
\int_0^{\delta} \frac{g(t) - s(x)}{t}\, dt
\]

exists for some $\delta < \pi$, then the Fourier series generated by $f$ converges to $s(x)$.
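Both tests can be seen in action on a simple example. The sketch below assumes the function with period $2\pi$ that equals $t$ on $[0, 2\pi)$; its coefficients $a_0 = 2\pi$, $a_k = 0$, $b_k = -2/k$ were computed by hand for this illustration. This function is of bounded variation near every point, so Jordan's test predicts convergence to $x$ at interior points and to $s(0) = \pi$ at the jump.

```python
import math

def partial_sum(x, n):
    """s_n(x) for the 2*pi-periodic function equal to t on [0, 2*pi).

    Uses the hand-computed coefficients a_0 = 2*pi, a_k = 0, b_k = -2/k.
    """
    return math.pi - 2.0 * sum(math.sin(k * x) / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, partial_sum(1.0, n), partial_sum(0.0, n))
```

The first column of values should approach $f(1) = 1$, while the second stays at $\pi$, the average of the one-sided limits $0$ and $2\pi$ at the jump.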


11.13 CESÀRO SUMMABILITY OF FOURIER SERIES


Continuity of a function $f$ is not a very fruitful hypothesis when it comes to
studying convergence of the Fourier series generated by $f$. In 1873, Du Bois-Reymond
gave an example of a function, continuous throughout the interval
$[0, 2\pi]$, whose Fourier series fails to converge on an uncountable subset of $[0, 2\pi]$.
On the other hand, continuity does suffice to establish Cesàro summability of the
series. This result (due to Fejér) and some of its consequences will be discussed
next.

Our first task is to obtain an integral representation for the arithmetic means 
of the partial sums of a Fourier series. 


Theorem 11.14. Assume that $f \in L([0, 2\pi])$ and suppose that $f$ is periodic with
period $2\pi$. Let $s_n$ denote the $n$th partial sum of the Fourier series generated by $f$ and
let

\[
\sigma_n(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{n-1}(x)}{n} \qquad (n = 1, 2, \dots).
\tag{23}
\]

Then we have the integral representation

\[
\sigma_n(x) = \frac{1}{n\pi} \int_0^{\pi} \frac{f(x + t) + f(x - t)}{2}\,
\frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt.
\tag{24}
\]



Proof. If we use the integral representation for $s_n(x)$ given in (19) and form the
sum defining $\sigma_n(x)$, we immediately obtain the required result because of formula
(16), Section 8.16.


NOTE. If we apply Theorem 11.14 to the constant function whose value is 1 at each
point we find $\sigma_n(x) = s_n(x) = 1$ for each $n$ and hence (24) becomes

\[
\frac{1}{n\pi} \int_0^{\pi} \frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt = 1.
\tag{25}
\]

Therefore, given any number $s$, we can combine (25) with (24) to write

\[
\sigma_n(x) - s = \frac{1}{n\pi} \int_0^{\pi}
\left\{ \frac{f(x + t) + f(x - t)}{2} - s \right\}
\frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt.
\tag{26}
\]

If we can choose a value of $s$ such that the integral on the right of (26) tends to 0
as $n \to \infty$, it will follow that $\sigma_n(x) \to s$ as $n \to \infty$. The next theorem shows that it
suffices to take $s = [f(x+) + f(x-)]/2$.


Theorem 11.15 (Fejér). Assume that $f \in L([0, 2\pi])$ and suppose that $f$ is periodic
with period $2\pi$. Define a function $s$ by the following equation:

\[
s(x) = \lim_{t \to 0+} \frac{f(x + t) + f(x - t)}{2},
\]

whenever the limit exists. Then, for each $x$ for which $s(x)$ is defined, the Fourier
series generated by $f$ is Cesàro summable and has $(C, 1)$ sum $s(x)$. That is, we have

\[
\lim_{n \to \infty} \sigma_n(x) = s(x),
\]

where $\{\sigma_n\}$ is the sequence of arithmetic means defined by (23). If, in addition, $f$ is
continuous on $[0, 2\pi]$, then the sequence $\{\sigma_n\}$ converges uniformly to $f$ on $[0, 2\pi]$.


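Proof. Let $g_x(t) = [f(x + t) + f(x - t)]/2 - s(x)$, whenever $s(x)$ is defined.
Then $g_x(t) \to 0$ as $t \to 0+$. Therefore, given $\varepsilon > 0$, there is a positive $\delta < \pi$
such that $|g_x(t)| < \varepsilon/2$ whenever $0 < t < \delta$. Note that $\delta$ depends on $x$ as well as
on $\varepsilon$. However, if $f$ is continuous on $[0, 2\pi]$, then $f$ is uniformly continuous on
$[0, 2\pi]$, and there exists a $\delta$ which serves equally well for every $x$ in $[0, 2\pi]$. Now
we use (26) and divide the interval of integration into two subintervals $[0, \delta]$ and
$[\delta, \pi]$. On $[0, \delta]$ we have

\[
\left| \frac{1}{n\pi} \int_0^{\delta} g_x(t)\, \frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt \right|
\le \frac{\varepsilon}{2n\pi} \int_0^{\pi} \frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt
= \frac{\varepsilon}{2},
\]

because of (25). On $[\delta, \pi]$ we have

\[
\left| \frac{1}{n\pi} \int_{\delta}^{\pi} g_x(t)\, \frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt \right|
\le \frac{1}{n\pi \sin^2 \frac{1}{2}\delta} \int_{\delta}^{\pi} |g_x(t)|\, dt
\le \frac{I(x)}{n\pi \sin^2 \frac{1}{2}\delta},
\]

where $I(x) = \int_0^{\pi} |g_x(t)|\, dt$. Now choose $N$ so that $I(x)/(N\pi \sin^2 \frac{1}{2}\delta) < \varepsilon/2$.
Then $n \ge N$ implies

\[
|\sigma_n(x) - s(x)| = \left| \frac{1}{n\pi} \int_0^{\pi} g_x(t)\,
\frac{\sin^2 \frac{1}{2}nt}{\sin^2 \frac{1}{2}t}\, dt \right| < \varepsilon.
\]

In other words, $\sigma_n(x) \to s(x)$ as $n \to \infty$.

If $f$ is continuous on $[0, 2\pi]$, then, by periodicity, $f$ is bounded on $R$ and there
is an $M$ such that $|g_x(t)| \le M$ for all $x$ and $t$, and we may replace $I(x)$ by $\pi M$ in
the above argument. The resulting $N$ is then independent of $x$ and hence $\sigma_n \to s = f$
uniformly on $[0, 2\pi]$.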
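The difference between ordinary partial sums and arithmetic means shows up clearly near a jump. The following sketch is only an illustration: it assumes the function with period $2\pi$ that equals $t$ on $[0, 2\pi)$, whose hand-computed coefficients are $a_0 = 2\pi$, $a_k = 0$, $b_k = -2/k$, and it samples both $s_n$ and $\sigma_n$ on an arbitrarily chosen grid.

```python
import math

def partial_sum(x, n):
    """s_n(x) for the 2*pi-periodic function equal to t on [0, 2*pi)."""
    return math.pi - 2.0 * sum(math.sin(k * x) / k for k in range(1, n + 1))

def fejer_mean(x, n):
    """sigma_n(x) = (s_0(x) + ... + s_{n-1}(x)) / n, as in (23)."""
    return math.pi - 2.0 * sum((1.0 - k / n) * math.sin(k * x) / k
                               for k in range(1, n))

grid = [2.0 * math.pi * i / 2000 for i in range(2001)]
n = 50
max_partial = max(partial_sum(x, n) for x in grid)
max_fejer = max(fejer_mean(x, n) for x in grid)
print(max_partial, max_fejer, 2.0 * math.pi)
```

On this grid the partial sum overshoots the supremum $2\pi$ of the function near the jump, while the arithmetic means do not. This is consistent with (24), since the kernel there is nonnegative and, by (25), has total mass 1.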


11.14 CONSEQUENCES OF FEJÉR'S THEOREM

Theorem 11.16. Let $f$ be continuous on $[0, 2\pi]$ and periodic with period $2\pi$. Let
$\{s_n\}$ denote the sequence of partial sums of the Fourier series generated by $f$, say

\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).
\tag{28}
\]

Then we have:

a) $\operatorname{l.i.m.}_{n \to \infty} s_n = f$ on $[0, 2\pi]$.

b) $\dfrac{1}{\pi} \displaystyle\int_0^{2\pi} |f(x)|^2\, dx = \dfrac{a_0^2}{2} + \sum_{n=1}^{\infty} (a_n^2 + b_n^2)$ (Parseval's formula).

c) The Fourier series can be integrated term by term. That is, for all $x$ we have

\[
\int_0^{x} f(t)\, dt = \frac{a_0 x}{2} + \sum_{n=1}^{\infty} \int_0^{x} (a_n \cos nt + b_n \sin nt)\, dt,
\]

the integrated series being uniformly convergent on every interval, even if the
Fourier series in (28) diverges.

d) If the Fourier series in (28) converges for some $x$, then it converges to $f(x)$.

Proof. Applying formula (3) of Theorem 11.2, with $t_n(x) = \sigma_n(x) = (1/n) \sum_{k=0}^{n-1} s_k(x)$,
we obtain the inequality

\[
\int_0^{2\pi} |f(x) - s_n(x)|^2\, dx \le \int_0^{2\pi} |f(x) - \sigma_n(x)|^2\, dx.
\tag{29}
\]

But, since $\sigma_n \to f$ uniformly on $[0, 2\pi]$, it follows that $\operatorname{l.i.m.}_{n \to \infty} \sigma_n = f$ on $[0, 2\pi]$,
and (29) implies (a). Part (b) follows from (a) because of Theorem 11.4. Part (c)
also follows from (a), by Theorem 9.18. Finally, if $\{s_n(x)\}$ converges for some $x$,
then $\{\sigma_n(x)\}$ must converge to the same limit. But since $\sigma_n(x) \to f(x)$ it follows
that $s_n(x) \to f(x)$, which proves (d).


11.15 THE WEIERSTRASS APPROXIMATION THEOREM 

Fejer’s theorem can also be used to prove a famous theorem of Weierstrass which 
states that every continuous function on a compact interval can be uniformly 
approximated by a polynomial. More precisely, we have : 

Theorem 11.17. Let $f$ be real-valued and continuous on a compact interval $[a, b]$.
Then for every $\varepsilon > 0$ there is a polynomial $p$ (which may depend on $\varepsilon$) such that

\[
|f(x) - p(x)| < \varepsilon \qquad \text{for every } x \text{ in } [a, b].
\tag{30}
\]

Proof. If $t \in [0, \pi)$, let $g(t) = f[a + t(b - a)/\pi]$; if $t \in [\pi, 2\pi]$, let $g(t) =
f[a + (2\pi - t)(b - a)/\pi]$ and define $g$ outside $[0, 2\pi]$ so that $g$ has period $2\pi$.
For the $\varepsilon$ given in the theorem, we can apply Fejér's theorem to find a function $\sigma$
defined by an equation of the form

\[
\sigma(t) = A_0 + \sum_{k=1}^{N} (A_k \cos kt + B_k \sin kt)
\]

such that $|g(t) - \sigma(t)| < \varepsilon/2$ for every $t$ in $[0, 2\pi]$. (Note that $N$, and hence $\sigma$,
depends on $\varepsilon$.) Since $\sigma$ is a finite sum of trigonometric functions, it generates a
power series expansion about the origin which converges uniformly on every finite
interval. The partial sums of this power series expansion constitute a sequence of
polynomials, say $\{p_n\}$, such that $p_n \to \sigma$ uniformly on $[0, 2\pi]$. Hence, for the
same $\varepsilon$, there exists an $m$ such that

\[
|p_m(t) - \sigma(t)| < \frac{\varepsilon}{2}, \qquad \text{for every } t \text{ in } [0, 2\pi].
\]

Therefore we have

\[
|p_m(t) - g(t)| < \varepsilon, \qquad \text{for every } t \text{ in } [0, 2\pi].
\tag{31}
\]

Now define the polynomial $p$ by the formula $p(x) = p_m[\pi(x - a)/(b - a)]$. Then
inequality (31) becomes (30) when we put $t = \pi(x - a)/(b - a)$.


11.16 OTHER FORMS OF FOURIER SERIES 
Using the formulas

\[
2 \cos nx = e^{inx} + e^{-inx} \qquad \text{and} \qquad 2i \sin nx = e^{inx} - e^{-inx},
\]

the Fourier series generated by $f$ can be expressed in terms of complex exponentials
as follows:

\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)
= \frac{a_0}{2} + \sum_{n=1}^{\infty} (\alpha_n e^{inx} + \beta_n e^{-inx}),
\]

where $\alpha_n = (a_n - ib_n)/2$ and $\beta_n = (a_n + ib_n)/2$. If we put $\alpha_0 = a_0/2$ and $\alpha_{-n} = \beta_n$,
we can write the exponential form more briefly as follows:

\[
f(x) \sim \sum_{n=-\infty}^{\infty} \alpha_n e^{inx}.
\]

The formulas (7) for the coefficients now become

\[
\alpha_n = \frac{1}{2\pi} \int_0^{2\pi} f(t) e^{-int}\, dt \qquad (n = 0, \pm 1, \pm 2, \dots).
\]

If $f$ has period $2\pi$, the interval of integration can be replaced by any other interval
of length $2\pi$.

More generally, if $f \in L([0, p])$ and if $f$ has period $p$, we write

\[
f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty}
\left( a_n \cos \frac{2\pi nx}{p} + b_n \sin \frac{2\pi nx}{p} \right)
\]

to mean that the coefficients are given by the formulas

\[
a_n = \frac{2}{p} \int_0^{p} f(t) \cos \frac{2\pi nt}{p}\, dt, \qquad
b_n = \frac{2}{p} \int_0^{p} f(t) \sin \frac{2\pi nt}{p}\, dt \qquad (n = 0, 1, 2, \dots).
\]

In exponential form we can write

\[
f(x) \sim \sum_{n=-\infty}^{\infty} \alpha_n e^{2\pi inx/p},
\]

where

\[
\alpha_n = \frac{1}{p} \int_0^{p} f(t) e^{-2\pi int/p}\, dt, \qquad \text{if } n = 0, \pm 1, \pm 2, \dots.
\]

All the convergence theorems for Fourier series of period $2\pi$ can also be applied
to the case of a general period $p$ by making a suitable change of scale.
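The relation between the exponential coefficients $\alpha_n$ and the real coefficients $a_n$, $b_n$ can be checked on an example. The sketch below assumes $f(t) = t$ on $[0, 2\pi]$, for which $a_n = 0$ and $b_n = -2/n$ when $n \ge 1$ (hand-computed values), so that $\alpha_n = (a_n - ib_n)/2 = i/n$ and $\alpha_{-n} = -i/n$. The function name `alpha` and the midpoint rule are illustrative choices.

```python
import cmath
import math

def alpha(f, n, steps=20000):
    """Midpoint-rule approximation of (1/(2*pi)) * integral of f(t) e^{-int}."""
    h = 2.0 * math.pi / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        t = (i + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2.0 * math.pi)

def identity(t):
    return t                      # sample function f(t) = t (an assumption)

for n in (-2, -1, 0, 1, 2):
    print(n, alpha(identity, n))
```

The printed values should be close to $i/n$ for $n \ne 0$ and to $a_0/2 = \pi$ for $n = 0$.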


11.17 THE FOURIER INTEGRAL THEOREM 

The hypothesis of periodicity, which appears in all the convergence theorems
dealing with Fourier series, is not as serious a restriction as it may appear to be at
first sight. If a function is initially defined on a finite interval, say $[a, b]$, we can
always extend the definition of $f$ outside $[a, b]$ by imposing some sort of periodicity
condition. For example, if $f(a) = f(b)$, we can define $f$ everywhere on $(-\infty, +\infty)$
by requiring the equation $f(x + p) = f(x)$ to hold for every $x$, where $p = b - a$.
(The condition $f(a) = f(b)$ can always be brought about by changing the value
of $f$ at one of the endpoints if necessary. This does not affect the existence or the
values of the integrals which are used to compute the Fourier coefficients of $f$.)
However, if the given function is already defined everywhere on $(-\infty, +\infty)$ and
is not periodic, then there is no hope of obtaining a Fourier series which represents
the function everywhere on $(-\infty, +\infty)$. Nevertheless, in such a case the function
can sometimes be represented by an infinite integral rather than by an infinite series.
These integrals, which are in many ways analogous to Fourier series, are known as
Fourier integrals, and the theorem which gives sufficient conditions for representing
a function by such an integral is known as the Fourier integral theorem. The basic
tools used in the theory are, as in the case of Fourier series, the Dirichlet integrals
and the Riemann-Lebesgue lemma.

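Theorem 11.18 (Fourier integral theorem). Assume that $f \in L(-\infty, +\infty)$. Suppose
there is a point $x$ in $R$ and an interval $[x - \delta, x + \delta]$ about $x$ such that either

a) $f$ is of bounded variation on $[x - \delta, x + \delta]$,

or else

b) both limits $f(x+)$ and $f(x-)$ exist and both Lebesgue integrals

\[
\int_0^{\delta} \frac{f(x + t) - f(x+)}{t}\, dt \qquad \text{and} \qquad
\int_0^{\delta} \frac{f(x - t) - f(x-)}{t}\, dt
\]

exist.

Then we have the formula

\[
\frac{f(x+) + f(x-)}{2} = \frac{1}{\pi} \int_0^{\infty}
\left[ \int_{-\infty}^{\infty} f(u) \cos v(u - x)\, du \right] dv,
\tag{32}
\]

the integral $\int_0^{\infty}$ being an improper Riemann integral.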

Proof. The first step in the proof is to establish the following formula:

\[
\lim_{\alpha \to +\infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(x + t)\, \frac{\sin \alpha t}{t}\, dt
= \frac{f(x+) + f(x-)}{2}.
\tag{33}
\]

For this purpose we write

\[
\int_{-\infty}^{\infty} f(x + t)\, \frac{\sin \alpha t}{\pi t}\, dt
= \int_{-\infty}^{-\delta} + \int_{-\delta}^{0} + \int_0^{\delta} + \int_{\delta}^{\infty}.
\]

When $\alpha \to +\infty$, the first and fourth integrals on the right tend to 0, because of
the Riemann-Lebesgue lemma. In the third integral, we can apply either Theorem
11.8 or Theorem 11.9 (depending on whether (a) or (b) is satisfied) to get

\[
\lim_{\alpha \to +\infty} \int_0^{\delta} f(x + t)\, \frac{\sin \alpha t}{\pi t}\, dt = \frac{f(x+)}{2}.
\]

Similarly, we have

\[
\int_{-\delta}^{0} f(x + t)\, \frac{\sin \alpha t}{\pi t}\, dt
= \int_0^{\delta} f(x - t)\, \frac{\sin \alpha t}{\pi t}\, dt \to \frac{f(x-)}{2}
\qquad \text{as } \alpha \to +\infty.
\]

Thus we have established (33). If we make a translation, we get

\[
\int_{-\infty}^{\infty} f(x + t)\, \frac{\sin \alpha t}{t}\, dt
= \int_{-\infty}^{\infty} f(u)\, \frac{\sin \alpha(u - x)}{u - x}\, du,
\]

and if we use the elementary formula

\[
\frac{\sin \alpha(u - x)}{u - x} = \int_0^{\alpha} \cos v(u - x)\, dv,
\]

the limit relation in (33) becomes

\[
\lim_{\alpha \to +\infty} \frac{1}{\pi} \int_{-\infty}^{\infty}
\left[ \int_0^{\alpha} f(u) \cos v(u - x)\, dv \right] du
= \frac{f(x+) + f(x-)}{2}.
\tag{34}
\]

But the formula we seek to prove is (34) with only the order of integration reversed.
By Theorem 10.40 we have

\[
\int_0^{\alpha} \left[ \int_{-\infty}^{\infty} f(u) \cos v(u - x)\, du \right] dv
= \int_{-\infty}^{\infty} \left[ \int_0^{\alpha} f(u) \cos v(u - x)\, dv \right] du
\]

for every $\alpha > 0$, since the cosine function is everywhere continuous and bounded.
Since the limit in (34) exists, this proves that

\[
\lim_{\alpha \to +\infty} \frac{1}{\pi} \int_0^{\alpha}
\left[ \int_{-\infty}^{\infty} f(u) \cos v(u - x)\, du \right] dv
= \frac{f(x+) + f(x-)}{2}.
\]

By Theorem 10.40, the integral $\int_{-\infty}^{\infty} f(u) \cos v(u - x)\, du$ is a continuous function
of $v$ on $[0, \alpha]$, so the integral $\int_0^{\infty}$ in (32) exists as an improper Riemann integral.
It need not exist as a Lebesgue integral.


11.18 THE EXPONENTIAL FORM OF THE FOURIER INTEGRAL THEOREM 

Theorem 11.19. If $f$ satisfies the hypotheses of the Fourier integral theorem, then we have

$$\frac{f(x+) + f(x-)}{2} = \frac{1}{2\pi}\lim_{\alpha\to+\infty} \int_{-\alpha}^{\alpha} \left[\int_{-\infty}^{\infty} f(u)\,e^{iv(u - x)}\,du\right] dv. \tag{35}$$


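Proof. Let $F(v) = \int_{-\infty}^{\infty} f(u)\cos v(u - x)\,du$. Then $F$ is continuous on $(-\infty, +\infty)$, $F(v) = F(-v)$ and hence $\int_{-\alpha}^{0} F(v)\,dv = \int_0^\alpha F(-v)\,dv = \int_0^\alpha F(v)\,dv$. Therefore (32) becomes

$$\frac{f(x+) + f(x-)}{2} = \lim_{\alpha\to+\infty} \frac{1}{\pi}\int_0^\alpha F(v)\,dv = \lim_{\alpha\to+\infty} \frac{1}{2\pi}\int_{-\alpha}^{\alpha} F(v)\,dv. \tag{36}$$

Now define $G$ on $(-\infty, +\infty)$ by the equation

$$G(v) = \int_{-\infty}^{\infty} f(u)\sin v(u - x)\,du.$$

Then $G$ is everywhere continuous and $G(v) = -G(-v)$. Hence $\int_{-\alpha}^{\alpha} G(v)\,dv = 0$ for every $\alpha$, so $\lim_{\alpha\to+\infty} \int_{-\alpha}^{\alpha} G(v)\,dv = 0$. Combining this with (36) we find

$$\frac{f(x+) + f(x-)}{2} = \lim_{\alpha\to+\infty} \frac{1}{2\pi}\int_{-\alpha}^{\alpha} \{F(v) + iG(v)\}\,dv.$$

This is formula (35).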


11.19 INTEGRAL TRANSFORMS 


Many functions in analysis can be expressed as Lebesgue integrals or improper 
Riemann integrals of the form 

$$g(y) = \int_{-\infty}^{\infty} K(x, y)\,f(x)\,dx. \tag{37}$$

A function $g$ defined by an equation of this sort (in which $y$ may be either real or complex) is called an integral transform of $f$. The function $K$ which appears in the integrand is referred to as the kernel of the transform.

Integral transforms are employed very extensively in both pure and applied 
mathematics. They are especially useful in solving certain boundary value prob- 
lems and certain types of integral equations. Some of the more commonly used 
transforms are listed below : 


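Exponential Fourier transform: $\displaystyle\int_{-\infty}^{\infty} e^{-ixy} f(x)\,dx$.

Fourier cosine transform: $\displaystyle\int_0^\infty \cos xy\, f(x)\,dx$.

Fourier sine transform: $\displaystyle\int_0^\infty \sin xy\, f(x)\,dx$.

Laplace transform: $\displaystyle\int_0^\infty e^{-xy} f(x)\,dx$.

Mellin transform: $\displaystyle\int_0^\infty x^{y-1} f(x)\,dx$.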


Since $e^{-ixy} = \cos xy - i \sin xy$, the sine and cosine transforms are merely special cases of the exponential Fourier transform in which the function $f$ vanishes on the negative real axis. The Laplace transform is also related to the exponential Fourier transform. If we consider a complex value of $y$, say $y = u + iv$, where $u$ and $v$ are real, we can write

$$\int_0^\infty e^{-xy} f(x)\,dx = \int_0^\infty e^{-ixv} e^{-xu} f(x)\,dx = \int_0^\infty e^{-ixv} \phi_u(x)\,dx,$$

where $\phi_u(x) = e^{-xu} f(x)$. Therefore the Laplace transform can also be regarded as a special case of the exponential Fourier transform.


NOTE. An equation such as (37) is sometimes written more briefly in the form $g = \mathcal{K}(f)$ or $g = \mathcal{K}f$, where $\mathcal{K}$ denotes the "operator" which converts $f$ into $g$. Since integration is involved in this equation, the operator $\mathcal{K}$ is referred to as an integral operator. It is clear that $\mathcal{K}$ is also a linear operator. That is,

$$\mathcal{K}(a_1 f_1 + a_2 f_2) = a_1 \mathcal{K} f_1 + a_2 \mathcal{K} f_2$$

if $a_1$ and $a_2$ are constants. The operator defined by the Fourier transform is often denoted by $\mathcal{F}$ and that defined by the Laplace transform is denoted by $\mathcal{L}$.


The exponential form of the Fourier integral theorem can be expressed in terms of Fourier transforms as follows. Let $g$ denote the Fourier transform of $f$, so that

$$g(u) = \int_{-\infty}^{\infty} f(t)\,e^{-iut}\,dt. \tag{38}$$

Then, at points of continuity of $f$, formula (35) becomes

$$f(x) = \lim_{\alpha\to+\infty} \frac{1}{2\pi}\int_{-\alpha}^{\alpha} g(u)\,e^{ixu}\,du, \tag{39}$$

and this is called the inversion formula for Fourier transforms. It tells us that a continuous function $f$ satisfying the conditions of the Fourier integral theorem is uniquely determined by its Fourier transform $g$.
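
As a numerical illustration of the inversion formula, the following sketch takes $f(t) = e^{-|t|}$, whose transform under the convention $g(u) = \int f(t)e^{-iut}\,dt$ is $g(u) = 2/(1 + u^2)$, and approximates the truncated inverse integral by a midpoint rule. The function names, the cutoff `alpha`, and the step count are illustrative choices, not part of the text.

```python
import cmath
import math

def g(u):
    """Fourier transform of f(t) = exp(-|t|): g(u) = 2 / (1 + u^2)."""
    return 2.0 / (1.0 + u * u)

def truncated_inverse(x, alpha=100.0, steps=100_000):
    """Midpoint-rule value of (1 / 2pi) * integral of g(u) e^{ixu} over [-alpha, alpha]."""
    h = 2.0 * alpha / steps
    total = 0j
    for k in range(steps):
        u = -alpha + (k + 0.5) * h
        total += g(u) * cmath.exp(1j * x * u)
    # The imaginary part cancels because g is even; keep the real part.
    return (total * h / (2.0 * math.pi)).real

for x in (1.0, 2.0):
    print(x, truncated_inverse(x), math.exp(-abs(x)))
```

As the cutoff grows, the printed approximations should approach $e^{-|x|}$; the truncation error for $x \ne 0$ decays roughly like $1/(x\alpha^2)$.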


NOTE. If $\mathcal{F}$ denotes the operator defined by (38), it is customary to denote by $\mathcal{F}^{-1}$ the operator defined by (39). Equations (38) and (39) can be expressed symbolically by writing $g = \mathcal{F}f$ and $f = \mathcal{F}^{-1}g$. The inversion formula tells us how to solve the equation $g = \mathcal{F}f$ for $f$ in terms of $g$.


Before we pursue the study of Fourier transforms any further, we introduce a 
new notion, the convolution of two functions. This can be interpreted as a special 
kind of integral transform in which the kernel K(x, y) depends only on the difference 
x — y. 


11.20 CONVOLUTIONS 


Definition 11.20. Given two functions $f$ and $g$, both Lebesgue integrable on $(-\infty, +\infty)$, let $S$ denote the set of $x$ for which the Lebesgue integral

$$h(x) = \int_{-\infty}^{\infty} f(t)\,g(x - t)\,dt \tag{40}$$

exists. This integral defines a function $h$ on $S$ called the convolution of $f$ and $g$. We also write $h = f * g$ to denote this function.

NOTE. It is easy to see (by a translation) that $f * g = g * f$ whenever the integral exists.


An important special case occurs when both $f$ and $g$ vanish on the negative real axis. In this case, $g(x - t) = 0$ if $t > x$, and (40) becomes

$$h(x) = \int_0^x f(t)\,g(x - t)\,dt. \tag{41}$$

It is clear that, in this case, the convolution will be defined at each point of an interval $[a, b]$ if both $f$ and $g$ are Riemann-integrable on $[a, b]$. However, this need not be so if we assume only that $f$ and $g$ are Lebesgue integrable on $[a, b]$. For example, let

$$f(t) = \frac{1}{\sqrt{t}} \qquad\text{and}\qquad g(t) = \frac{1}{\sqrt{1 - t}}, \qquad\text{if } 0 < t < 1,$$

and let $f(t) = g(t) = 0$ if $t \le 0$ or if $t \ge 1$. Then $f$ has an infinite discontinuity at $t = 0$. Nevertheless, the Lebesgue integral $\int_{-\infty}^{\infty} f(t)\,dt = \int_0^1 t^{-1/2}\,dt$ exists. Similarly, the Lebesgue integral $\int_{-\infty}^{\infty} g(t)\,dt = \int_0^1 (1 - t)^{-1/2}\,dt$ exists, although $g$ has an infinite discontinuity at $t = 1$. However, when we form the convolution integral in (40) corresponding to $x = 1$, we find

$$\int_{-\infty}^{\infty} f(t)\,g(1 - t)\,dt = \int_0^1 t^{-1}\,dt.$$

Observe that the two discontinuities of $f$ and $g$ have "coalesced" into one discontinuity of such nature that the convolution integral does not exist.

This example shows that there may be certain points on the real axis at which the integral in (40) fails to exist, even though both $f$ and $g$ are Lebesgue-integrable on $(-\infty, +\infty)$. Let us refer to such points as "singularities" of $h$. It is easy to show that such singularities cannot occur unless both $f$ and $g$ have infinite discontinuities. More precisely, we have the following theorem:


Theorem 11.21. Let $\mathbf{R} = (-\infty, +\infty)$. Assume that $f \in L(\mathbf{R})$, $g \in L(\mathbf{R})$, and that either $f$ or $g$ is bounded on $\mathbf{R}$. Then the convolution integral

$$h(x) = \int_{-\infty}^{\infty} f(t)\,g(x - t)\,dt \tag{42}$$

exists for every $x$ in $\mathbf{R}$, and the function $h$ so defined is bounded on $\mathbf{R}$. If, in addition, the bounded function $f$ or $g$ is continuous on $\mathbf{R}$, then $h$ is also continuous on $\mathbf{R}$ and $h \in L(\mathbf{R})$.

Proof. Since $f * g = g * f$, it suffices to consider the case in which $g$ is bounded. Suppose $|g| \le M$. Then

$$|f(t)\,g(x - t)| \le M\,|f(t)|. \tag{43}$$

The reader can verify that for each $x$, the product $f(t)g(x - t)$ is a measurable function of $t$ on $\mathbf{R}$, so Theorem 10.35 shows that the integral for $h(x)$ exists. The inequality (43) also shows that $|h(x)| \le M \int |f|$, so $h$ is bounded on $\mathbf{R}$.

Now if $g$ is also continuous on $\mathbf{R}$, then Theorem 10.40 shows that $h$ is continuous on $\mathbf{R}$. Now for every compact interval $[a, b]$ we have

$$\begin{aligned}
\int_a^b |h(x)|\,dx &\le \int_a^b \left[\int_{-\infty}^{\infty} |f(t)|\,|g(x - t)|\,dt\right] dx
= \int_{-\infty}^{\infty} |f(t)| \left[\int_a^b |g(x - t)|\,dx\right] dt \\
&= \int_{-\infty}^{\infty} |f(t)| \left[\int_{a-t}^{b-t} |g(y)|\,dy\right] dt
\le \int_{-\infty}^{\infty} |f(t)|\,dt \int_{-\infty}^{\infty} |g(y)|\,dy,
\end{aligned}$$

so, by Theorem 10.31, $h \in L(\mathbf{R})$.


Theorem 11.22. Let $\mathbf{R} = (-\infty, +\infty)$. Assume that $f \in L^2(\mathbf{R})$ and $g \in L^2(\mathbf{R})$. Then the convolution integral (42) exists for each $x$ in $\mathbf{R}$ and the function $h$ is bounded on $\mathbf{R}$.


Proof. For fixed $x$, let $g_x(t) = g(x - t)$. Then $g_x$ is measurable on $\mathbf{R}$ and $g_x \in L^2(\mathbf{R})$, so Theorem 10.54 implies that the product $f \cdot g_x \in L(\mathbf{R})$. In other words, the convolution integral $h(x)$ exists. Now $h(x)$ is an inner product, $h(x) = (f, g_x)$, hence the Cauchy-Schwarz inequality shows that

$$|h(x)| \le \|f\|\,\|g_x\| = \|f\|\,\|g\|,$$

so $h$ is bounded on $\mathbf{R}$.


11.21 THE CONVOLUTION THEOREM FOR FOURIER TRANSFORMS 

The next theorem shows that the Fourier transform of a convolution $f * g$ is the product of the Fourier transforms of $f$ and of $g$. In operator notation,

$$\mathcal{F}(f * g) = \mathcal{F}(f) \cdot \mathcal{F}(g).$$

Theorem 11.23. Let $\mathbf{R} = (-\infty, +\infty)$. Assume that $f \in L(\mathbf{R})$, $g \in L(\mathbf{R})$, and that at least one of $f$ or $g$ is continuous and bounded on $\mathbf{R}$. Let $h$ denote the convolution, $h = f * g$. Then for every real $u$ we have

$$\int_{-\infty}^{\infty} h(x)\,e^{-ixu}\,dx = \left(\int_{-\infty}^{\infty} f(t)\,e^{-itu}\,dt\right)\left(\int_{-\infty}^{\infty} g(y)\,e^{-iyu}\,dy\right). \tag{44}$$

The integral on the left exists both as a Lebesgue integral and as an improper Riemann integral.

Proof. Assume that $g$ is continuous and bounded on $\mathbf{R}$. Let $\{a_n\}$ and $\{b_n\}$ be two increasing sequences of positive real numbers such that $a_n \to +\infty$ and $b_n \to +\infty$. Define a sequence of functions $\{f_n\}$ on $\mathbf{R}$ as follows:

$$f_n(t) = \int_{-a_n}^{b_n} e^{-iux}\,g(x - t)\,dx.$$

Since

$$\int_a^b |e^{-iux}\,g(x - t)|\,dx \le \int_{-\infty}^{\infty} |g|$$

for all compact intervals $[a, b]$, Theorem 10.31 shows that

$$\lim_{n\to\infty} f_n(t) = \int_{-\infty}^{\infty} e^{-iux}\,g(x - t)\,dx \qquad\text{for every real } t. \tag{45}$$

The translation $y = x - t$ gives us

$$\int_{-\infty}^{\infty} e^{-iux}\,g(x - t)\,dx = e^{-iut}\int_{-\infty}^{\infty} e^{-iuy}\,g(y)\,dy,$$

and (45) shows that

$$\lim_{n\to\infty} f(t)\,f_n(t) = f(t)\,e^{-iut}\left(\int_{-\infty}^{\infty} e^{-iuy}\,g(y)\,dy\right)$$

for all $t$. Now $f_n$ is continuous on $\mathbf{R}$ (by Theorem 10.38), so the product $f \cdot f_n$ is measurable on $\mathbf{R}$. Since

$$|f(t)\,f_n(t)| \le |f(t)| \int_{-\infty}^{\infty} |g|,$$

the product $f \cdot f_n$ is Lebesgue-integrable on $\mathbf{R}$, and the Lebesgue dominated convergence theorem shows that

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} f(t)\,f_n(t)\,dt = \left(\int_{-\infty}^{\infty} f(t)\,e^{-iut}\,dt\right)\left(\int_{-\infty}^{\infty} e^{-iuy}\,g(y)\,dy\right). \tag{46}$$

But

$$\int_{-\infty}^{\infty} f(t)\,f_n(t)\,dt = \int_{-\infty}^{\infty} f(t)\left[\int_{-a_n}^{b_n} e^{-iux}\,g(x - t)\,dx\right] dt.$$

Since the function $k$ defined by $k(x, t) = g(x - t)$ is continuous and bounded on $\mathbf{R}^2$ and since the integral $\int_a^b e^{-iux}\,dx$ exists for every compact interval $[a, b]$, Theorem 10.40 permits us to reverse the order of integration and we obtain

$$\int_{-\infty}^{\infty} f(t)\,f_n(t)\,dt = \int_{-a_n}^{b_n} e^{-iux}\left[\int_{-\infty}^{\infty} f(t)\,g(x - t)\,dt\right] dx = \int_{-a_n}^{b_n} e^{-iux}\,h(x)\,dx.$$

Therefore, (46) shows that

$$\lim_{n\to\infty} \int_{-a_n}^{b_n} h(x)\,e^{-iux}\,dx = \left(\int_{-\infty}^{\infty} f(t)\,e^{-iut}\,dt\right)\left(\int_{-\infty}^{\infty} g(y)\,e^{-iuy}\,dy\right),$$

which proves (44). The integral on the left also exists as an improper Riemann integral because the integrand is continuous and bounded on $\mathbf{R}$ and $\int_a^b |h(x)\,e^{-iux}|\,dx \le \int_{-\infty}^{\infty} |h|$ for every compact interval $[a, b]$.


As an application of the convolution theorem we shall derive the following 
property of the Gamma function. 


Example. If $p > 0$ and $q > 0$, we have the formula

$$\int_0^1 x^{p-1}(1 - x)^{q-1}\,dx = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p + q)}. \tag{47}$$

The integral on the left is called the Beta function and is usually denoted by $B(p, q)$. To prove (47) we let

$$f_p(t) = \begin{cases} t^{p-1} e^{-t} & \text{if } t > 0, \\ 0 & \text{if } t \le 0. \end{cases}$$

Then $f_p \in L(\mathbf{R})$ and $\int_{-\infty}^{\infty} f_p(t)\,dt = \int_0^\infty t^{p-1} e^{-t}\,dt = \Gamma(p)$. Let $h$ denote the convolution, $h = f_p * f_q$. Taking $u = 0$ in the convolution formula (44) we find, if $p > 1$ or $q > 1$,

$$\int_{-\infty}^{\infty} h(x)\,dx = \int_{-\infty}^{\infty} f_p(t)\,dt \int_{-\infty}^{\infty} f_q(y)\,dy = \Gamma(p)\,\Gamma(q). \tag{48}$$

Now we calculate the integral on the left in another way. Since both $f_p$ and $f_q$ vanish on the negative real axis, we have

$$h(x) = \int_0^x f_p(t)\,f_q(x - t)\,dt = \begin{cases} e^{-x}\displaystyle\int_0^x t^{p-1}(x - t)^{q-1}\,dt & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}$$

The change of variable $t = ux$ gives us, for $x > 0$,

$$h(x) = e^{-x} x^{p+q-1} \int_0^1 u^{p-1}(1 - u)^{q-1}\,du = e^{-x} x^{p+q-1} B(p, q).$$

Therefore $\int_{-\infty}^{\infty} h(x)\,dx = B(p, q) \int_0^\infty e^{-x} x^{p+q-1}\,dx = B(p, q)\,\Gamma(p + q)$ which, when used in (48), proves (47) if $p > 1$ or $q > 1$. To obtain the result for $p > 0$, $q > 0$ use the relation $pB(p, q) = (p + q)B(p + 1, q)$.
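
The identity between the Beta integral and the Gamma quotient can be checked numerically. The sketch below compares a midpoint-rule approximation of $\int_0^1 x^{p-1}(1-x)^{q-1}\,dx$ with $\Gamma(p)\Gamma(q)/\Gamma(p+q)$ computed from the standard library's `math.gamma`. The function names and the step count are illustrative assumptions, and the test values are chosen with $p, q \ge 1$ so that the integrand stays bounded.

```python
import math

def beta_midpoint(p, q, steps=20_000):
    """Midpoint-rule approximation of the Beta integral on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (p - 1) * (1.0 - x) ** (q - 1)
    return total * h

def beta_gamma(p, q):
    """Gamma(p) Gamma(q) / Gamma(p + q), the closed form of the Beta function."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

for p, q in ((2.0, 2.0), (2.5, 3.0)):
    print(p, q, beta_midpoint(p, q), beta_gamma(p, q))
```

For exponents below 1 the integrand is unbounded at an endpoint and a plain midpoint rule converges slowly, so a check in that range would need a different quadrature.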
11.22 THE POISSON SUMMATION FORMULA

We conclude this chapter with a discussion of an important formula, called 
Poisson's summation formula, which has many applications. The formula 
can be expressed in different ways. For the applications we have in mind, the 
following form is convenient. 


Theorem 11.24. Let $f$ be a nonnegative function such that the integral $\int_{-\infty}^{\infty} f(x)\,dx$ exists as an improper Riemann integral. Assume also that $f$ increases on $(-\infty, 0]$ and decreases on $[0, +\infty)$. Then we have

$$\sum_{m=-\infty}^{+\infty} \frac{f(m+) + f(m-)}{2} = \sum_{n=-\infty}^{+\infty} \int_{-\infty}^{\infty} f(t)\,e^{-2\pi i n t}\,dt, \tag{49}$$

each series being absolutely convergent.


Proof. The proof makes use of the Fourier expansion of the function $F$ defined by the series

$$F(x) = \sum_{m=-\infty}^{+\infty} f(m + x). \tag{50}$$

First we show that this series converges absolutely for each real $x$ and that the convergence is uniform on the interval $[0, 1]$.

Since $f$ decreases on $[0, +\infty)$ we have, for $x \ge 0$,

$$\sum_{m=0}^{N} f(m + x) \le f(0) + \sum_{m=1}^{N} f(m) \le f(0) + \int_0^\infty f(t)\,dt.$$

Therefore, by the Weierstrass M-test (Theorem 9.6), the series $\sum_{m=0}^{\infty} f(m + x)$ converges uniformly on $[0, +\infty)$. A similar argument shows that the series $\sum_{m=-\infty}^{-1} f(m + x)$ converges uniformly on $(-\infty, 1]$. Therefore the series in (50) converges for all $x$ and the convergence is uniform on the intersection

$$(-\infty, 1] \cap [0, +\infty) = [0, 1].$$

The sum function $F$ is periodic with period 1. In fact, we have $F(x + 1) = \sum_{m=-\infty}^{+\infty} f(m + x + 1)$, and this series is merely a rearrangement of that in (50). Since all its terms are nonnegative, it converges to the same sum. Hence

$$F(x + 1) = F(x).$$

Next we show that $F$ is of bounded variation on every compact interval. If $0 \le x \le \tfrac{1}{2}$, then $f(m + x)$ is a decreasing function of $x$ if $m \ge 0$, and an increasing function of $x$ if $m < 0$. Therefore we have

$$F(x) = \sum_{m=0}^{\infty} f(m + x) - \sum_{m=-\infty}^{-1} \{-f(m + x)\},$$

so $F$ is the difference of two decreasing functions. Therefore $F$ is of bounded variation on $[0, \tfrac{1}{2}]$. A similar argument shows that $F$ is also of bounded variation on $[-\tfrac{1}{2}, 0]$. By periodicity, $F$ is of bounded variation on every compact interval.
Now consider the Fourier series (in exponential form) generated by $F$, say

$$F(x) \sim \sum_{n=-\infty}^{+\infty} \alpha_n e^{2\pi i n x}.$$

Since $F$ is of bounded variation on $[0, 1]$ it is Riemann-integrable on $[0, 1]$, and the Fourier coefficients are given by the formula

$$\alpha_n = \int_0^1 F(x)\,e^{-2\pi i n x}\,dx. \tag{51}$$

Also, since $F$ is of bounded variation on every compact interval, Jordan's test shows that the Fourier series converges for every $x$ and that

$$\frac{F(x+) + F(x-)}{2} = \sum_{n=-\infty}^{+\infty} \alpha_n e^{2\pi i n x}. \tag{52}$$

To obtain the Poisson summation formula we express the coefficients $\alpha_n$ in another form. We use (50) in (51) and integrate term by term (justified by uniform convergence) to obtain

$$\alpha_n = \sum_{m=-\infty}^{+\infty} \int_0^1 f(m + x)\,e^{-2\pi i n x}\,dx.$$

The change of variable $t = m + x$ gives us

$$\alpha_n = \sum_{m=-\infty}^{+\infty} \int_m^{m+1} f(t)\,e^{-2\pi i n t}\,dt = \int_{-\infty}^{\infty} f(t)\,e^{-2\pi i n t}\,dt,$$

since $e^{2\pi i m n} = 1$. Using this in (52) we obtain

$$\frac{F(x+) + F(x-)}{2} = \sum_{n=-\infty}^{+\infty} \left(\int_{-\infty}^{\infty} f(t)\,e^{-2\pi i n t}\,dt\right) e^{2\pi i n x}. \tag{53}$$

When $x = 0$ this reduces to (49).

NOTE. In Theorem 11.24 there are no continuity requirements on $f$. However, if $f$ is continuous at each integer, then each term $f(m + x)$ in the series (50) is continuous at $x = 0$ and hence, because of uniform convergence, the sum function $F$ is also continuous at 0. In this case, (49) becomes

$$\sum_{m=-\infty}^{+\infty} f(m) = \sum_{n=-\infty}^{+\infty} \int_{-\infty}^{\infty} f(t)\,e^{-2\pi i n t}\,dt. \tag{54}$$

The monotonicity requirements on $f$ can be relaxed. For example, since each member of (49) depends linearly on $f$, if the theorem is true for $f_1$ and for $f_2$ then it is also true for any linear combination $a_1 f_1 + a_2 f_2$. In particular, the formula holds for a complex-valued function $f = u + iv$ if it holds for $u$ and $v$ separately.

Example 1. Transformation formula for the theta function. The theta function $\theta$ is defined for all $x > 0$ by the equation

$$\theta(x) = \sum_{n=-\infty}^{+\infty} e^{-\pi n^2 x}.$$

We shall use Poisson's formula to derive the transformation equation

$$\theta(x) = \frac{1}{\sqrt{x}}\,\theta\!\left(\frac{1}{x}\right) \qquad\text{for } x > 0. \tag{55}$$

For fixed $\alpha > 0$, let $f(x) = e^{-\alpha x^2}$ for all real $x$. This function satisfies all the hypotheses of Theorem 11.24 and is continuous everywhere. Therefore, Poisson's formula implies

$$\sum_{m=-\infty}^{+\infty} e^{-\alpha m^2} = \sum_{n=-\infty}^{+\infty} \int_{-\infty}^{\infty} e^{-\alpha t^2} e^{2\pi i n t}\,dt. \tag{56}$$

The left member is $\theta(\alpha/\pi)$. The integral on the right is equal to

$$\int_{-\infty}^{\infty} e^{-\alpha t^2} e^{2\pi i n t}\,dt = 2\int_0^\infty e^{-\alpha t^2} \cos 2\pi n t\,dt = \frac{2}{\sqrt{\alpha}} \int_0^\infty e^{-x^2} \cos\frac{2\pi n x}{\sqrt{\alpha}}\,dx = \frac{2}{\sqrt{\alpha}}\,F\!\left(\frac{\pi n}{\sqrt{\alpha}}\right),$$

where

$$F(y) = \int_0^\infty e^{-x^2} \cos 2xy\,dx.$$

But $F(y) = \tfrac{1}{2}\sqrt{\pi}\,e^{-y^2}$ (see Exercise 10.22), so

$$\int_{-\infty}^{\infty} e^{-\alpha t^2} e^{2\pi i n t}\,dt = \left(\frac{\pi}{\alpha}\right)^{1/2} e^{-\pi^2 n^2/\alpha}.$$

Using this in (56) and taking $\alpha = \pi x$ we obtain (55).
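
A quick numerical check of the transformation equation $\theta(x) = x^{-1/2}\,\theta(1/x)$ is sketched below. It truncates the series $\theta(x) = \sum_n e^{-\pi n^2 x}$ symmetrically; the function names and the number of terms are illustrative assumptions. Because the terms decay like $e^{-\pi n^2 x}$, a modest number of terms suffices for the values of $x$ used here.

```python
import math

def theta(x, terms=50):
    """Partial sum of theta(x) = sum over all integers n of exp(-pi n^2 x)."""
    tail = sum(math.exp(-math.pi * n * n * x) for n in range(1, terms + 1))
    return 1.0 + 2.0 * tail  # n = 0 term plus the symmetric pairs +n, -n

def theta_transformed(x, terms=50):
    """The right-hand side theta(1/x) / sqrt(x) of the transformation equation."""
    return theta(1.0 / x, terms) / math.sqrt(x)

for x in (0.5, 1.0, 2.0, 3.7):
    print(x, theta(x), theta_transformed(x))
```

The two printed columns should agree to within rounding error, which also illustrates why the formula is useful: for small $x$ the series for $\theta(1/x)$ converges much faster than the series for $\theta(x)$.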


Example 2. Partial-fraction decomposition of coth x. The hyperbolic cotangent, coth x, is defined for $x \ne 0$ by the equation

$$\coth x = \frac{e^{2x} + 1}{e^{2x} - 1}.$$

We shall use Poisson's formula to derive the so-called partial-fraction decomposition

$$\coth x = \frac{1}{x} + 2x \sum_{n=1}^{\infty} \frac{1}{x^2 + \pi^2 n^2} \tag{57}$$

for $x > 0$. For fixed $\alpha > 0$, let

$$f(x) = \begin{cases} e^{-\alpha x} & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}$$

Then $f$ clearly satisfies the hypotheses of Theorem 11.24. Also, $f$ is continuous everywhere except at 0, where $f(0+) = 1$ and $f(0-) = 0$. Therefore, the Poisson formula implies

$$\frac{1}{2} + \sum_{m=1}^{\infty} e^{-m\alpha} = \sum_{n=-\infty}^{+\infty} \int_0^\infty e^{-\alpha t - 2\pi i n t}\,dt. \tag{58}$$

The sum on the left is a geometric series with sum $1/(e^{\alpha} - 1)$, and the integral on the right is equal to $1/(\alpha + 2\pi i n)$. Therefore (58) becomes

$$\frac{1}{2} + \frac{1}{e^{\alpha} - 1} = \frac{1}{\alpha} + \sum_{n=1}^{\infty} \left\{\frac{1}{\alpha + 2\pi i n} + \frac{1}{\alpha - 2\pi i n}\right\},$$

and this gives (57) when $\alpha$ is replaced by $2x$.
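
The partial-fraction decomposition $\coth x = 1/x + 2x\sum_{n\ge 1} 1/(x^2 + \pi^2 n^2)$ can be compared with the defining formula numerically. In the sketch below the function names and the truncation length are illustrative assumptions; since the terms decay only like $1/n^2$, the truncated series differs from $\coth x$ by roughly $2x/(\pi^2 N)$ after $N$ terms.

```python
import math

def coth(x):
    """Hyperbolic cotangent from its definition (e^{2x} + 1) / (e^{2x} - 1)."""
    e = math.exp(2.0 * x)
    return (e + 1.0) / (e - 1.0)

def coth_partial_fractions(x, terms=100_000):
    """Truncated series 1/x + 2x * sum_{n=1}^{terms} 1 / (x^2 + pi^2 n^2)."""
    s = sum(1.0 / (x * x + math.pi ** 2 * n * n) for n in range(1, terms + 1))
    return 1.0 / x + 2.0 * x * s

for x in (0.5, 1.0, 3.0):
    print(x, coth(x), coth_partial_fractions(x))
```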


EXERCISES 


Orthogonal systems 

11.1 Verify that the trigonometric system in (1) is orthonormal on $[0, 2\pi]$.

11.2 A finite collection of functions $\{\varphi_0, \varphi_1, \ldots, \varphi_m\}$ is said to be linearly independent on $[a, b]$ if the equation

$$\sum_{k=0}^{m} c_k \varphi_k(x) = 0 \qquad\text{for all } x \text{ in } [a, b]$$

implies $c_0 = c_1 = \cdots = c_m = 0$. An infinite collection is called linearly independent on $[a, b]$ if every finite subset is linearly independent on $[a, b]$. Prove that every orthonormal system on $[a, b]$ is linearly independent on $[a, b]$.

11.3 This exercise describes the Gram-Schmidt process for converting any linearly independent system to an orthogonal system. Let $\{f_0, f_1, \ldots\}$ be a linearly independent system on $[a, b]$ (as defined in Exercise 11.2). Define a new system $\{g_0, g_1, \ldots\}$ recursively as follows:

$$g_0 = f_0, \qquad g_{r+1} = f_{r+1} - \sum_{k=0}^{r} a_k g_k,$$

where $a_k = (f_{r+1}, g_k)/(g_k, g_k)$ if $\|g_k\| \ne 0$, and $a_k = 0$ if $\|g_k\| = 0$. Prove that $g_{n+1}$ is orthogonal to each of $g_0, g_1, \ldots, g_n$ for every $n \ge 0$.

11.4 Refer to Exercise 11.3. Let $(f, g) = \int_{-1}^{1} f(t)g(t)\,dt$. Apply the Gram-Schmidt process to the system of polynomials $\{1, t, t^2, \ldots\}$ on the interval $[-1, 1]$ and show that

$$g_1(t) = t, \qquad g_2(t) = t^2 - \tfrac{1}{3}, \qquad g_3(t) = t^3 - \tfrac{3}{5}t, \qquad g_4(t) = t^4 - \tfrac{6}{7}t^2 + \tfrac{3}{35}.$$
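
The recursion of the Gram-Schmidt process can be carried out in exact rational arithmetic, which gives a way to check hand computations for the monomials on $[-1, 1]$. The sketch below is one possible arrangement, not a prescribed method: polynomials are stored as coefficient lists (lowest degree first), and the helper names `inner` and `gram_schmidt` are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    """Exact integral over [-1, 1] of p(t) q(t) for coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers of t integrate to 0 on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def gram_schmidt(count):
    """Orthogonalize 1, t, t^2, ... by subtracting projections onto earlier g_k."""
    basis = []
    for r in range(count):
        f = [Fraction(0)] * r + [Fraction(1)]  # the monomial t^r
        g = list(f)
        for gk in basis:
            a = inner(f, gk) / inner(gk, gk)   # coefficient (f, g_k) / (g_k, g_k)
            for i, c in enumerate(gk):
                g[i] -= a * c
        basis.append(g)
    return basis

for g in gram_schmidt(5):
    print([str(c) for c in g])
```

Each printed list gives the coefficients of one $g_r$ in increasing powers of $t$, so the last line corresponds to $g_4$.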

11.5 a) Assume $f \in R$ on $[0, 2\pi]$, where $f$ is real and has period $2\pi$. Prove that for every $\varepsilon > 0$ there is a continuous function $g$ of period $2\pi$ such that $\|f - g\| < \varepsilon$. Hint. Choose a partition $P_\varepsilon$ of $[0, 2\pi]$ for which $f$ satisfies Riemann's condition $U(P, f) - L(P, f) < \varepsilon$ and construct a piecewise linear $g$ which agrees with $f$ at the points of $P_\varepsilon$.

b) Use part (a) to show that parts (a), (b) and (c) of Theorem 11.16 hold if $f$ is Riemann integrable on $[0, 2\pi]$.

11.6 In this exercise all functions are assumed to be continuous on a compact interval $[a, b]$. Let $\{\varphi_0, \varphi_1, \ldots\}$ be an orthonormal system on $[a, b]$.

a) Prove that the following three statements are equivalent.

1) $(f, \varphi_n) = (g, \varphi_n)$ for all $n$ implies $f = g$. (Two distinct continuous functions cannot have the same Fourier coefficients.)

2) $(f, \varphi_n) = 0$ for all $n$ implies $f = 0$. (The only continuous function orthogonal to every $\varphi_n$ is the zero function.)

3) If $T$ is an orthonormal set on $[a, b]$ such that $\{\varphi_0, \varphi_1, \ldots\} \subseteq T$, then $\{\varphi_0, \varphi_1, \ldots\} = T$. (We cannot enlarge the orthonormal set.) This property is described by saying that $\{\varphi_0, \varphi_1, \ldots\}$ is maximal or complete.

b) Let $\varphi_n(x) = e^{inx}/\sqrt{2\pi}$ for $n$ an integer, and verify that the set $\{\varphi_n : n \in \mathbf{Z}\}$ is complete on every interval of length $2\pi$.

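11.7 If $x \in \mathbf{R}$ and $n = 1, 2, \ldots$, let $f_n(x) = (x^2 - 1)^n$ and define

$$\phi_0(x) = 1, \qquad \phi_n(x) = \frac{1}{2^n n!}\,f_n^{(n)}(x).$$

It is clear that $\phi_n$ is a polynomial. This is called the Legendre polynomial of order $n$. The first few are

$$\phi_1(x) = x, \qquad \phi_2(x) = \tfrac{3}{2}x^2 - \tfrac{1}{2}, \qquad \phi_3(x) = \tfrac{5}{2}x^3 - \tfrac{3}{2}x, \qquad \phi_4(x) = \tfrac{35}{8}x^4 - \tfrac{15}{4}x^2 + \tfrac{3}{8}.$$

Derive the following properties of Legendre polynomials:

a) $\phi_n'(x) = x\,\phi_{n-1}'(x) + n\,\phi_{n-1}(x)$.

b) $\phi_n(x) = x\,\phi_{n-1}(x) + \dfrac{x^2 - 1}{n}\,\phi_{n-1}'(x)$.

c) $(n + 1)\,\phi_{n+1}(x) = (2n + 1)\,x\,\phi_n(x) - n\,\phi_{n-1}(x)$.

d) $\phi_n$ satisfies the differential equation $[(1 - x^2)y']' + n(n + 1)y = 0$.

e) $[(1 - x^2)\,\Delta(x)]' + [m(m + 1) - n(n + 1)]\,\phi_m(x)\,\phi_n(x) = 0$, where $\Delta = \phi_n \phi_m' - \phi_m \phi_n'$.

f) The set $\{\phi_0, \phi_1, \phi_2, \ldots\}$ is orthogonal on $[-1, 1]$.

g) $\displaystyle\int_{-1}^{1} \phi_n^2\,dx = \frac{2n - 1}{2n + 1} \int_{-1}^{1} \phi_{n-1}^2\,dx$.

h) $\displaystyle\int_{-1}^{1} \phi_n^2\,dx = \frac{2}{2n + 1}$.

NOTE. The polynomials

$$g_n(t) = \frac{2^n (n!)^2}{(2n)!}\,\phi_n(t)$$

arise by applying the Gram-Schmidt process to the system $\{1, t, t^2, \ldots\}$ on the interval $[-1, 1]$. (See Exercise 11.4.)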
Trigonometric Fourier series

11.8 Assume that $f \in L([-\pi, \pi])$ and that $f$ has period $2\pi$. Show that the Fourier series generated by $f$ assumes the following special forms under the conditions stated:

a) If $f(-x) = f(x)$ when $0 < x < \pi$, then

$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx, \qquad\text{where } a_n = \frac{2}{\pi}\int_0^\pi f(t)\cos nt\,dt.$$

b) If $f(-x) = -f(x)$ when $0 < x < \pi$, then

$$f(x) \sim \sum_{n=1}^{\infty} b_n \sin nx, \qquad\text{where } b_n = \frac{2}{\pi}\int_0^\pi f(t)\sin nt\,dt.$$


In Exercises 11.9 through 11.15, show that each of the expansions is valid in the range 
indicated. Suggestion. Use Exercise 11.8 and Theorem 11.16(c) when possible. 

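11.9 a) $\displaystyle x = \pi - 2\sum_{n=1}^{\infty} \frac{\sin nx}{n}$ if $0 < x < 2\pi$.

b) $\displaystyle \frac{x^2}{2} = \pi x - \frac{\pi^2}{3} + 2\sum_{n=1}^{\infty} \frac{\cos nx}{n^2}$ if $0 \le x \le 2\pi$.

NOTE. When $x = 0$ this gives $\zeta(2) = \pi^2/6$.

11.10 a) $\displaystyle \frac{\pi}{4} = \sum_{n=1}^{\infty} \frac{\sin (2n - 1)x}{2n - 1}$ if $0 < x < \pi$.

b) $\displaystyle x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos (2n - 1)x}{(2n - 1)^2}$ if $0 \le x \le \pi$.

11.11 a) $\displaystyle x = 2\sum_{n=1}^{\infty} \frac{(-1)^{n-1} \sin nx}{n}$ if $-\pi < x < \pi$.

b) $\displaystyle x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty} \frac{(-1)^n \cos nx}{n^2}$ if $-\pi \le x \le \pi$.

11.12 $\displaystyle x^2 = \frac{4}{3}\pi^2 + 4\sum_{n=1}^{\infty} \left(\frac{\cos nx}{n^2} - \frac{\pi \sin nx}{n}\right)$ if $0 < x < 2\pi$.

11.13 a) $\displaystyle \cos x = \frac{8}{\pi}\sum_{n=1}^{\infty} \frac{n \sin 2nx}{4n^2 - 1}$ if $0 < x < \pi$.

b) $\displaystyle \sin x = \frac{2}{\pi} - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos 2nx}{4n^2 - 1}$ if $0 < x < \pi$.

11.14 a) $\displaystyle x \cos x = -\frac{1}{2}\sin x + 2\sum_{n=2}^{\infty} \frac{(-1)^n n \sin nx}{n^2 - 1}$ if $-\pi < x < \pi$.

b) $\displaystyle x \sin x = 1 - \frac{1}{2}\cos x - 2\sum_{n=2}^{\infty} \frac{(-1)^n \cos nx}{n^2 - 1}$ if $-\pi \le x \le \pi$.

11.15 a) $\displaystyle \log\left|\sin\frac{x}{2}\right| = -\log 2 - \sum_{n=1}^{\infty} \frac{\cos nx}{n}$ if $x \ne 2k\pi$ ($k$ an integer).

b) $\displaystyle \log\left|\cos\frac{x}{2}\right| = -\log 2 - \sum_{n=1}^{\infty} \frac{(-1)^n \cos nx}{n}$ if $x \ne (2k + 1)\pi$.

c) $\displaystyle \log\left|\tan\frac{x}{2}\right| = -2\sum_{n=1}^{\infty} \frac{\cos (2n - 1)x}{2n - 1}$ if $x \ne k\pi$.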

11.16 a) Find a continuous function on $[-\pi, \pi]$ which generates the Fourier series $\sum_{n=1}^{\infty} (-1)^n n^{-3} \sin nx$. Then use Parseval's formula to prove that $\zeta(6) = \pi^6/945$.

b) Use an appropriate Fourier series in conjunction with Parseval's formula to show that $\zeta(4) = \pi^4/90$.

11.17 Assume that $f$ has a continuous derivative on $[0, 2\pi]$, that $f(0) = f(2\pi)$, and that $\int_0^{2\pi} f(t)\,dt = 0$. Prove that $\|f'\| \ge \|f\|$, with equality if and only if $f(x) = a \cos x + b \sin x$. Hint. Use Parseval's formula.

11.18 A sequence $\{B_n\}$ of periodic functions (of period 1) is defined on $\mathbf{R}$ as follows:

$$B_{2n}(x) = (-1)^{n+1}\,\frac{2(2n)!}{(2\pi)^{2n}} \sum_{k=1}^{\infty} \frac{\cos 2\pi kx}{k^{2n}} \qquad (n = 1, 2, \ldots),$$

$$B_{2n+1}(x) = (-1)^{n+1}\,\frac{2(2n + 1)!}{(2\pi)^{2n+1}} \sum_{k=1}^{\infty} \frac{\sin 2\pi kx}{k^{2n+1}} \qquad (n = 0, 1, 2, \ldots).$$

($B_n$ is called the Bernoulli function of order $n$.) Show that:

a) $B_1(x) = x - [x] - \tfrac{1}{2}$ if $x$ is not an integer. ($[x]$ is the greatest integer $\le x$.)

b) $\int_0^1 B_n(x)\,dx = 0$ if $n \ge 1$ and $B_n'(x) = nB_{n-1}(x)$ if $n \ge 2$.

c) $B_n(x) = P_n(x)$ if $0 < x < 1$, where $P_n$ is the $n$th Bernoulli polynomial. (See Exercise 9.38 for the definition of $P_n$.)

d) $\displaystyle B_n(x) = -\frac{n!}{(2\pi i)^n} \sum_{\substack{k=-\infty \\ k \ne 0}}^{\infty} \frac{e^{2\pi i kx}}{k^n} \qquad (n = 1, 2, \ldots).$


11.19 Let $f$ be the function of period $2\pi$ whose values on $[-\pi, \pi]$ are

$$f(x) = 1 \text{ if } 0 < x < \pi, \qquad f(x) = -1 \text{ if } -\pi < x < 0, \qquad f(x) = 0 \text{ if } x = 0 \text{ or } x = \pi.$$

a) Show that

$$f(x) = \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\sin (2n - 1)x}{2n - 1} \qquad\text{for every } x.$$

This is one example of a class of Fourier series which have a curious property known as Gibbs' phenomenon. This exercise is designed to illustrate this phenomenon. In that which follows, $s_n(x)$ denotes the $n$th partial sum of the series in part (a).

b) Show that

$$s_n(x) = \frac{2}{\pi}\int_0^x \frac{\sin 2nt}{\sin t}\,dt.$$

c) Show that, in $(0, \pi)$, $s_n$ has local maxima at $x_1, x_3, \ldots, x_{2n-1}$ and local minima at $x_2, x_4, \ldots, x_{2n-2}$, where $x_m = \tfrac{1}{2}m\pi/n$ $(m = 1, 2, \ldots, 2n - 1)$.

d) Show that $s_n(\tfrac{1}{2}\pi/n)$ is the largest of the numbers $s_n(x_m)$ $(m = 1, 2, \ldots, 2n - 1)$.

e) Interpret $s_n(\tfrac{1}{2}\pi/n)$ as a Riemann sum and prove that

$$\lim_{n\to\infty} s_n\!\left(\frac{\pi}{2n}\right) = \frac{2}{\pi}\int_0^\pi \frac{\sin t}{t}\,dt.$$

The value of the limit in (e) is about 1.179. Thus, although $f$ has a jump equal to 2 at the origin, the graphs of the approximating curves $s_n$ tend to approximate a vertical segment of length 2.358 in the vicinity of the origin. This is the Gibbs phenomenon.
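
The overshoot value quoted above can be reproduced numerically. The sketch below evaluates the partial sum $s_n(x) = \frac{4}{\pi}\sum_{k=1}^{n} \frac{\sin(2k-1)x}{2k-1}$ at $x = \pi/(2n)$ and compares it with a midpoint-rule approximation of $\frac{2}{\pi}\int_0^\pi \frac{\sin t}{t}\,dt$. The function names and step counts are illustrative assumptions.

```python
import math

def s(n, x):
    """n-th partial sum (4/pi) * sum_{k=1}^{n} sin((2k-1)x) / (2k-1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, n + 1)
    )

def gibbs_limit(steps=100_000):
    """Midpoint-rule value of (2/pi) * integral of sin(t)/t over [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h          # midpoints avoid the removable point t = 0
        total += math.sin(t) / t
    return (2.0 / math.pi) * total * h

print("limit        :", gibbs_limit())
for n in (10, 100, 1000):
    print("s_n(pi/(2n)) :", n, s(n, math.pi / (2 * n)))
```

The peak height does not shrink toward 1 as $n$ grows; it settles near the limiting value, while the location $\pi/(2n)$ of the peak moves toward the origin.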

11.20 If $f(x) \sim a_0/2 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ and if $f$ is of bounded variation on $[0, 2\pi]$, show that $a_n = O(1/n)$ and $b_n = O(1/n)$. Hint. Write $f = g - h$, where $g$ and $h$ are increasing on $[0, 2\pi]$. Then

$$a_n = \frac{1}{n\pi}\int_0^{2\pi} g(x)\,d(\sin nx) - \frac{1}{n\pi}\int_0^{2\pi} h(x)\,d(\sin nx).$$

Now apply Theorem 7.31.

11.21 Suppose $g \in L([a, \delta])$ for every $a$ in $(0, \delta)$ and assume that $g$ satisfies a "right-handed" Lipschitz condition at 0. (See the Note following Theorem 11.9.) Show that the Lebesgue integral $\int_0^\delta |g(t) - g(0+)|/t\,dt$ exists.

11.22 Use Exercise 11.21 to prove that differentiability of $f$ at a point implies convergence of its Fourier series at the point.

11.23 Let $g$ be continuous on $[0, 1]$ and assume that $\int_0^1 t^n g(t)\,dt = 0$ for $n = 0, 1, 2, \ldots$. Show that:

a) $\int_0^1 g(t)^2\,dt = \int_0^1 g(t)\,(g(t) - P(t))\,dt$ for every polynomial $P$.

b) $\int_0^1 g(t)^2\,dt = 0$. Hint. Use Theorem 11.17.

c) $g(t) = 0$ for every $t$ in $[0, 1]$.


11.24 Use the Weierstrass approximation theorem to prove each of the following state- 
ments. 

a) If $f$ is continuous on $[1, +\infty)$ and if $f(x) \to a$ as $x \to +\infty$, then $f$ can be uniformly approximated on $[1, +\infty)$ by a function $g$ of the form $g(x) = p(1/x)$, where $p$ is a polynomial.

b) If $f$ is continuous on $[0, +\infty)$ and if $f(x) \to a$ as $x \to +\infty$, then $f$ can be uniformly approximated on $[0, +\infty)$ by a function $g$ of the form $g(x) = p(e^{-x})$, where $p$ is a polynomial.

11.25 Assume that $f(x) \sim a_0/2 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ and let $\{\sigma_n\}$ be the sequence of arithmetic means of the partial sums of this series, as it was given in (23). Show that:

a) $\displaystyle \sigma_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n-1} \left(1 - \frac{k}{n}\right)(a_k \cos kx + b_k \sin kx)$.

b) $\displaystyle \int_0^{2\pi} |f(x) - \sigma_n(x)|^2\,dx = \int_0^{2\pi} |f(x)|^2\,dx - \frac{\pi}{2}a_0^2 - \pi\sum_{k=1}^{n-1} (a_k^2 + b_k^2) + \frac{\pi}{n^2}\sum_{k=1}^{n-1} k^2 (a_k^2 + b_k^2)$.

c) If $f$ is continuous on $[0, 2\pi]$ and has period $2\pi$, then

$$\lim_{n\to\infty} \frac{1}{n^2}\sum_{k=1}^{n} k^2 (a_k^2 + b_k^2) = 0.$$


11.26 Consider the Fourier series (in exponential form) generated by a function $f$ which is continuous on $[0, 2\pi]$ and periodic with period $2\pi$, say

$$f(x) \sim \sum_{n=-\infty}^{+\infty} \alpha_n e^{inx}.$$

Assume also that the derivative $f' \in R$ on $[0, 2\pi]$.

a) Prove that the series $\sum_{n=-\infty}^{+\infty} n^2 |\alpha_n|^2$ converges; then use the Cauchy-Schwarz inequality to deduce that $\sum_{n=-\infty}^{+\infty} |\alpha_n|$ converges.

b) From (a), deduce that the series $\sum_{n=-\infty}^{+\infty} \alpha_n e^{inx}$ converges uniformly to a continuous sum function $g$ on $[0, 2\pi]$. Then prove that $f = g$.


Fourier integrals 

11.27 If $f$ satisfies the hypotheses of the Fourier integral theorem, show that:

a) If $f$ is even, that is, if $f(-t) = f(t)$ for every $t$, then

$$\frac{f(x+) + f(x-)}{2} = \frac{2}{\pi}\lim_{\alpha\to+\infty} \int_0^\alpha \cos vx \left[\int_0^\infty f(u)\cos vu\,du\right] dv.$$

b) If $f$ is odd, that is, if $f(-t) = -f(t)$ for every $t$, then

$$\frac{f(x+) + f(x-)}{2} = \frac{2}{\pi}\lim_{\alpha\to+\infty} \int_0^\alpha \sin vx \left[\int_0^\infty f(u)\sin vu\,du\right] dv.$$

Use the Fourier integral theorem to evaluate the improper integrals in Exercises 11.28 
through 11.30. Suggestion. Use Exercise 11.27 when possible. 


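11.28 $\displaystyle \frac{2}{\pi}\int_0^\infty \frac{\sin v \cos vx}{v}\,dv = \begin{cases} 1 & \text{if } -1 < x < 1, \\ 0 & \text{if } |x| > 1, \\ \tfrac{1}{2} & \text{if } |x| = 1. \end{cases}$

11.29 $\displaystyle \int_0^\infty \frac{\cos ax}{b^2 + x^2}\,dx = \frac{\pi}{2b}\,e^{-|a|b}$ if $b > 0$.

Hint. Apply Exercise 11.27 with $f(u) = e^{-b|u|}$.

11.30 $\displaystyle \int_0^\infty \frac{x \sin ax}{1 + x^2}\,dx = \frac{|a|}{a}\,\frac{\pi}{2}\,e^{-|a|}$ if $a \ne 0$.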


11.31 a) Prove that

$$\frac{\Gamma(p)\,\Gamma(p)}{\Gamma(2p)} = 2\int_0^{1/2} x^{p-1}(1 - x)^{p-1}\,dx.$$

b) Make a suitable change of variable in (a) and derive the duplication formula for the Gamma function:

$$\Gamma(2p)\,\Gamma(\tfrac{1}{2}) = 2^{2p-1}\,\Gamma(p)\,\Gamma(p + \tfrac{1}{2}).$$

NOTE. In Exercise 10.30 it is shown that $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.

11.32 If $f(x) = e^{-x^2/2}$ and $g(x) = xf(x)$ for all $x$, prove that

$$f(y) = \sqrt{\frac{2}{\pi}}\int_0^\infty f(x)\cos xy\,dx \qquad\text{and}\qquad g(y) = \sqrt{\frac{2}{\pi}}\int_0^\infty g(x)\sin xy\,dx.$$


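11.33 This exercise describes another form of Poisson's summation formula. Assume that $f$ is nonnegative, decreasing, and continuous on $[0, +\infty)$ and that $\int_0^\infty f(x)\,dx$ exists as an improper Riemann integral. Let

$$g(y) = \sqrt{\frac{2}{\pi}}\int_0^\infty f(x)\cos xy\,dx.$$

If $\alpha$ and $\beta$ are positive numbers such that $\alpha\beta = 2\pi$, prove that

$$\sqrt{\alpha}\left\{\tfrac{1}{2}f(0) + \sum_{m=1}^{\infty} f(m\alpha)\right\} = \sqrt{\beta}\left\{\tfrac{1}{2}g(0) + \sum_{n=1}^{\infty} g(n\beta)\right\}.$$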

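11.34 Prove that the transformation formula (55) for $\theta(x)$ can be put in the form

$$\sqrt{\alpha}\left\{\tfrac{1}{2} + \sum_{m=1}^{\infty} e^{-\alpha^2 m^2/2}\right\} = \sqrt{\beta}\left\{\tfrac{1}{2} + \sum_{n=1}^{\infty} e^{-\beta^2 n^2/2}\right\},$$

where $\alpha\beta = 2\pi$, $\alpha > 0$.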

11.35 If $s > 1$, prove that

$$\pi^{-s/2}\,\Gamma\!\left(\frac{s}{2}\right) n^{-s} = \int_0^\infty e^{-\pi n^2 x}\,x^{s/2 - 1}\,dx$$

and derive the formula

$$\pi^{-s/2}\,\Gamma\!\left(\frac{s}{2}\right)\zeta(s) = \int_0^\infty \psi(x)\,x^{s/2 - 1}\,dx,$$

where $2\psi(x) = \theta(x) - 1$. Use this and the transformation formula for $\theta(x)$ to prove that

$$\pi^{-s/2}\,\Gamma\!\left(\frac{s}{2}\right)\zeta(s) = \frac{1}{s(s - 1)} + \int_1^\infty \left(x^{s/2 - 1} + x^{(1 - s)/2 - 1}\right)\psi(x)\,dx.$$

Laplace transforms 

Let $c$ be a positive number such that the integral $\int_0^\infty e^{-ct}|f(t)|\,dt$ exists as an improper Riemann integral. Let $z = x + iy$, where $x > c$. It is easy to show that the integral

$$F(z) = \int_0^\infty e^{-zt} f(t)\,dt$$

exists both as an improper Riemann integral and as a Lebesgue integral. The function $F$ so defined is called the Laplace transform of $f$, denoted by $\mathcal{L}(f)$. The following exercises describe some properties of Laplace transforms.


11.36 Verify the entries in the following table of Laplace transforms.

    f(t)            F(z) = ∫_0^∞ e^{−zt} f(t) dt       z = x + iy
    e^{αt}          (z − α)^{−1}                        (x > α)
    cos αt          z/(z² + α²)                         (x > 0)
    sin αt          α/(z² + α²)                         (x > 0)
    t^p e^{αt}      Γ(p + 1)/(z − α)^{p+1}              (x > α, p > 0)


11.37 Show that the convolution $h = f * g$ assumes the form

$$h(t) = \int_0^t f(x)\,g(t - x)\,dx$$

when both $f$ and $g$ vanish on the negative real axis. Use the convolution theorem for Fourier transforms to prove that $\mathcal{L}(f * g) = \mathcal{L}(f) \cdot \mathcal{L}(g)$.

11.38 Assume $f$ is continuous on $(0, +\infty)$ and let $F(z) = \int_0^\infty e^{-zt} f(t)\,dt$ for $z = x + iy$, $x > c > 0$. If $s > c$ and $a > 0$ prove that:

a) $F(s + a) = a\int_0^\infty g(t)\,e^{-at}\,dt$, where $g(x) = \int_0^x e^{-st} f(t)\,dt$.

b) If $F(s + na) = 0$ for $n = 0, 1, 2, \ldots$, then $f(t) = 0$ for $t > 0$. Hint. Use Exercise 11.23.

c) If $h$ is continuous on $(0, +\infty)$ and if $f$ and $h$ have the same Laplace transform, then $f(t) = h(t)$ for every $t > 0$.

11.39 Let $F(z) = \int_0^\infty e^{-zt} f(t)\,dt$ for $z = x + iy$, $x > c > 0$. Let $t$ be a point at which $f$ satisfies one of the "local" conditions (a) or (b) of the Fourier integral theorem (Theorem 11.18). Prove that for each $a > c$ we have

$$\frac{f(t+) + f(t-)}{2} = \frac{1}{2\pi}\lim_{T\to+\infty} \int_{-T}^{T} e^{(a + iv)t}\,F(a + iv)\,dv.$$

This is called the inversion formula for Laplace transforms. The limit on the right is usually evaluated with the help of residue calculus, as described in Section 16.26. Hint. Let $g(t) = e^{-at} f(t)$ for $t \ge 0$, $g(t) = 0$ for $t < 0$, and apply Theorem 11.19 to $g$.
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CHAPTER 12 


MULTIVARIABLE DIFFERENTIAL 

CALCULUS 


12.1 INTRODUCTION 

Partial derivatives of functions from R" to R 1 were discussed briefly in Chapter 5. 
We also introduced derivatives of vector-valued functions from R 1 to R". This 
chapter extends derivative theory to functions from R n to R m . 

As noted in Section 5.14, the partial derivative is a somewhat unsatisfactory 
generalization of the usual derivative because existence of all the partial derivatives 
D\f \ . . . , D n f at a particular point does not necessarily imply continuity of / at 
that point. The trouble with partial derivatives is that they treat a function of 
several variables as a function of one variable at a time. The partial derivative 
describes the rate of change of a function in the direction of each coordinate axis. 
There is a slight generalization, called the directional derivative , which studies the 
rate of change of a function in an arbitrary direction. It applies to both real- and 
vector-valued functions. 

12.2 THE DIRECTIONAL DERIVATIVE 

Let $S$ be a subset of $\mathbf{R}^n$, and let $\mathbf{f} : S \to \mathbf{R}^m$ be a function defined on $S$ with values
in $\mathbf{R}^m$. We wish to study how $\mathbf{f}$ changes as we move from a point $\mathbf{c}$ in $S$ along a
line segment to a nearby point $\mathbf{c} + \mathbf{u}$, where $\mathbf{u} \ne \mathbf{0}$. Each point on the segment
can be expressed as $\mathbf{c} + h\mathbf{u}$, where $h$ is real. The vector $\mathbf{u}$ describes the direction
of the line segment. We assume that $\mathbf{c}$ is an interior point of $S$. Then there is an
$n$-ball $B(\mathbf{c}; r)$ lying in $S$, and, if $h$ is small enough, the line segment joining $\mathbf{c}$ to
$\mathbf{c} + h\mathbf{u}$ will lie in $B(\mathbf{c}; r)$ and hence in $S$.

Definition 12.1. The directional derivative of $\mathbf{f}$ at $\mathbf{c}$ in the direction $\mathbf{u}$, denoted by
the symbol $\mathbf{f}'(\mathbf{c}; \mathbf{u})$, is defined by the equation

$$\mathbf{f}'(\mathbf{c}; \mathbf{u}) = \lim_{h \to 0} \frac{\mathbf{f}(\mathbf{c} + h\mathbf{u}) - \mathbf{f}(\mathbf{c})}{h}, \tag{1}$$

whenever the limit on the right exists.

NOTE. Some authors require that $\|\mathbf{u}\| = 1$, but this is not assumed here.

Examples 

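1. The definition in (1) is meaningful if $\mathbf{u} = \mathbf{0}$. In this case $\mathbf{f}'(\mathbf{c}; \mathbf{0})$ exists and equals $\mathbf{0}$
for every $\mathbf{c}$ in $S$.

2. If $\mathbf{u} = \mathbf{u}_k$, the $k$th unit coordinate vector, then $\mathbf{f}'(\mathbf{c}; \mathbf{u}_k)$ is called a partial derivative
and is denoted by $D_k \mathbf{f}(\mathbf{c})$. When $\mathbf{f}$ is real-valued this agrees with the definition given
in Chapter 5.

3. If $\mathbf{f} = (f_1, \ldots, f_m)$, then $\mathbf{f}'(\mathbf{c}; \mathbf{u})$ exists if and only if $f_k'(\mathbf{c}; \mathbf{u})$ exists for each $k =
1, 2, \ldots, m$, in which case

$$\mathbf{f}'(\mathbf{c}; \mathbf{u}) = (f_1'(\mathbf{c}; \mathbf{u}), \ldots, f_m'(\mathbf{c}; \mathbf{u})).$$

In particular, when $\mathbf{u} = \mathbf{u}_k$ we find

$$D_k \mathbf{f}(\mathbf{c}) = (D_k f_1(\mathbf{c}), \ldots, D_k f_m(\mathbf{c})). \tag{2}$$

4. If $\mathbf{F}(t) = \mathbf{f}(\mathbf{c} + t\mathbf{u})$, then $\mathbf{F}'(0) = \mathbf{f}'(\mathbf{c}; \mathbf{u})$. More generally, $\mathbf{F}'(t) = \mathbf{f}'(\mathbf{c} + t\mathbf{u}; \mathbf{u})$ if
either derivative exists.

5. If $f(\mathbf{x}) = \|\mathbf{x}\|^2$, then

$$F(t) = f(\mathbf{c} + t\mathbf{u}) = (\mathbf{c} + t\mathbf{u}) \cdot (\mathbf{c} + t\mathbf{u}) = \|\mathbf{c}\|^2 + 2t\,\mathbf{c} \cdot \mathbf{u} + t^2 \|\mathbf{u}\|^2,$$

so $F'(t) = 2\mathbf{c} \cdot \mathbf{u} + 2t\|\mathbf{u}\|^2$; hence $F'(0) = f'(\mathbf{c}; \mathbf{u}) = 2\mathbf{c} \cdot \mathbf{u}$.

6. Linear functions. A function $\mathbf{f} : \mathbf{R}^n \to \mathbf{R}^m$ is called linear if $\mathbf{f}(a\mathbf{x} + b\mathbf{y}) = a\mathbf{f}(\mathbf{x}) + b\mathbf{f}(\mathbf{y})$
for every $\mathbf{x}$ and $\mathbf{y}$ in $\mathbf{R}^n$ and every pair of scalars $a$ and $b$. If $\mathbf{f}$ is linear, the quotient
on the right of (1) simplifies to $\mathbf{f}(\mathbf{u})$, so $\mathbf{f}'(\mathbf{c}; \mathbf{u}) = \mathbf{f}(\mathbf{u})$ for every $\mathbf{c}$ and every $\mathbf{u}$.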
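
The squared-norm example can be checked numerically by evaluating the difference quotient from definition (1) for shrinking $h$. This is a minimal sketch; the helper name `directional_quotient` and the sample vectors are assumptions made for illustration.

```python
def directional_quotient(f, c, u, h):
    """Difference quotient (f(c + h u) - f(c)) / h for a real-valued f."""
    shifted = [ci + h * ui for ci, ui in zip(c, u)]
    return (f(shifted) - f(c)) / h

def norm_squared(x):
    return sum(xi * xi for xi in x)

c = [1.0, -2.0, 0.5]
u = [3.0, 1.0, 2.0]   # not a unit vector; the definition does not require one

exact = 2 * sum(ci * ui for ci, ui in zip(c, u))   # the closed form 2 c . u

quotients = {h: directional_quotient(norm_squared, c, u, h) for h in (1e-1, 1e-3, 1e-5)}
for h, q in quotients.items():
    print(h, q, exact)
```

For this function the quotient equals $2\mathbf{c} \cdot \mathbf{u} + h\|\mathbf{u}\|^2$ exactly, so the printed values approach the limit linearly in $h$.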


12.3 DIRECTIONAL DERIVATIVES AND CONTINUITY 

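If $\mathbf{f}'(\mathbf{c}; \mathbf{u})$ exists in every direction $\mathbf{u}$, then in particular all the partial derivatives
$D_1 \mathbf{f}(\mathbf{c}), \ldots, D_n \mathbf{f}(\mathbf{c})$ exist. However, the converse is not true. For example,
consider the real-valued function $f : \mathbf{R}^2 \to \mathbf{R}^1$ given by

$$f(x, y) = \begin{cases} x + y & \text{if } x = 0 \text{ or } y = 0, \\ 1 & \text{otherwise.} \end{cases}$$

Then $D_1 f(0, 0) = D_2 f(0, 0) = 1$. Nevertheless, if we consider any other direction
$\mathbf{u} = (a_1, a_2)$, where $a_1 \ne 0$ and $a_2 \ne 0$, then

$$\frac{f(\mathbf{0} + h\mathbf{u}) - f(\mathbf{0})}{h} = \frac{f(h\mathbf{u})}{h} = \frac{1}{h},$$

and this does not tend to a limit as $h \to 0$.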

A rather surprising fact is that a function can have a finite directional derivative
$\mathbf{f}'(\mathbf{c}; \mathbf{u})$ for every $\mathbf{u}$ but may fail to be continuous at $\mathbf{c}$. For example, let

$$f(x, y) = \begin{cases} xy^2/(x^2 + y^4) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Let $\mathbf{u} = (a_1, a_2)$ be any vector in $\mathbf{R}^2$. Then we have

$$\frac{f(\mathbf{0} + h\mathbf{u}) - f(\mathbf{0})}{h} = \frac{f(ha_1, ha_2)}{h} = \frac{a_1 a_2^2}{a_1^2 + h^2 a_2^4},$$

and hence

$$f'(\mathbf{0}; \mathbf{u}) = \begin{cases} a_2^2/a_1 & \text{if } a_1 \ne 0, \\ 0 & \text{if } a_1 = 0. \end{cases}$$

Thus, $f'(\mathbf{0}; \mathbf{u})$ exists for all $\mathbf{u}$. On the other hand, the function $f$ takes the value $\tfrac{1}{2}$
at each point of the parabola $x = y^2$ (except at the origin), so $f$ is not continuous
at $(0, 0)$, since $f(0, 0) = 0$.
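
Both halves of this example can be observed numerically: the difference quotients settle down in every direction, yet the function stays at $1/2$ along the parabola $x = y^2$ however close to the origin one looks. The sketch below is illustrative only; the helper names and sample directions are assumptions.

```python
def f(x, y):
    """The example function: x y^2 / (x^2 + y^4) for x != 0, and 0 on the line x = 0."""
    return 0.0 if x == 0 else x * y * y / (x * x + y ** 4)

def quotient(a1, a2, h):
    """Difference quotient at the origin in the direction (a1, a2); note f(0, 0) = 0."""
    return f(h * a1, h * a2) / h

def limit_formula(a1, a2):
    """Closed form for the directional derivative at the origin."""
    return 0.0 if a1 == 0 else a2 * a2 / a1

directions = [(2.0, 3.0), (-1.0, 0.5), (0.0, 1.0), (1.0, 0.0)]
for a1, a2 in directions:
    print((a1, a2), quotient(a1, a2, 1e-4), limit_formula(a1, a2))

# Along the parabola x = y^2 the value is 1/2, no matter how small t is.
parabola_values = [f(t * t, t) for t in (1e-1, 1e-2, 1e-3)]
print(parabola_values)
```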

Thus we see that even the existence of all directional derivatives at a point fails 
to imply continuity at that point. For this reason, directional derivatives, like 
partial derivatives, are a somewhat unsatisfactory extension of the one-dimensional 
concept of derivative. We turn now to a more suitable generalization which implies 
continuity and, at the same time, extends the principal theorems of one-dimensional 
derivative theory to functions of several variables. This is called the total derivative. 


12.4 THE TOTAL DERIVATIVE 

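In the one-dimensional case, a function $f$ with a derivative at $c$ can be approximated
near $c$ by a linear polynomial. In fact, if $f'(c)$ exists, let $E_c(h)$ denote the difference

$$E_c(h) = \frac{f(c + h) - f(c)}{h} - f'(c) \quad \text{if } h \ne 0, \tag{3}$$

and let $E_c(0) = 0$. Then we have

$$f(c + h) = f(c) + f'(c)h + hE_c(h), \tag{4}$$

an equation which holds also for $h = 0$. This is called the first-order Taylor
formula for approximating $f(c + h) - f(c)$ by $f'(c)h$. The error committed is
$hE_c(h)$. From (3) we see that $E_c(h) \to 0$ as $h \to 0$. The error $hE_c(h)$ is said to be
of smaller order than $h$ as $h \to 0$.

We focus attention on two properties of formula (4). First, the quantity
$f'(c)h$ is a linear function of $h$. That is, if we write $T_c(h) = f'(c)h$, then

$$T_c(ah_1 + bh_2) = aT_c(h_1) + bT_c(h_2).$$

Second, the error term $hE_c(h)$ is of smaller order than $h$ as $h \to 0$. The total
derivative of a function $\mathbf{f}$ from $\mathbf{R}^n$ to $\mathbf{R}^m$ will now be defined in such a way that it
preserves these two properties.

Let $\mathbf{f} : S \to \mathbf{R}^m$ be a function defined on a set $S$ in $\mathbf{R}^n$ with values in $\mathbf{R}^m$. Let $\mathbf{c}$
be an interior point of $S$, and let $B(\mathbf{c}; r)$ be an $n$-ball lying in $S$. Let $\mathbf{v}$ be a point
in $\mathbf{R}^n$ with $\|\mathbf{v}\| < r$, so that $\mathbf{c} + \mathbf{v} \in B(\mathbf{c}; r)$.

Definition 12.2. The function $\mathbf{f}$ is said to be differentiable at $\mathbf{c}$ if there exists a linear
function $\mathbf{T}_\mathbf{c} : \mathbf{R}^n \to \mathbf{R}^m$ such that

$$\mathbf{f}(\mathbf{c} + \mathbf{v}) = \mathbf{f}(\mathbf{c}) + \mathbf{T}_\mathbf{c}(\mathbf{v}) + \|\mathbf{v}\|\,\mathbf{E}_\mathbf{c}(\mathbf{v}), \tag{5}$$

where $\mathbf{E}_\mathbf{c}(\mathbf{v}) \to \mathbf{0}$ as $\mathbf{v} \to \mathbf{0}$.

NOTE. Equation (5) is called a first-order Taylor formula. It is to hold for all $\mathbf{v}$ in
$\mathbf{R}^n$ with $\|\mathbf{v}\| < r$. The linear function $\mathbf{T}_\mathbf{c}$ is called the total derivative of $\mathbf{f}$ at $\mathbf{c}$. We
also write (5) in the form

$$\mathbf{f}(\mathbf{c} + \mathbf{v}) = \mathbf{f}(\mathbf{c}) + \mathbf{T}_\mathbf{c}(\mathbf{v}) + o(\|\mathbf{v}\|) \quad \text{as } \mathbf{v} \to \mathbf{0}.$$

The next theorem shows that if the total derivative exists, it is unique. It also
relates the total derivative to directional derivatives.

Theorem 12.3. Assume $\mathbf{f}$ is differentiable at $\mathbf{c}$ with total derivative $\mathbf{T}_\mathbf{c}$. Then the
directional derivative $\mathbf{f}'(\mathbf{c}; \mathbf{u})$ exists for every $\mathbf{u}$ in $\mathbf{R}^n$ and we have

$$\mathbf{T}_\mathbf{c}(\mathbf{u}) = \mathbf{f}'(\mathbf{c}; \mathbf{u}). \tag{6}$$

Proof. If $\mathbf{u} = \mathbf{0}$ then $\mathbf{f}'(\mathbf{c}; \mathbf{0}) = \mathbf{0}$ and $\mathbf{T}_\mathbf{c}(\mathbf{0}) = \mathbf{0}$. Therefore we can assume that
$\mathbf{u} \ne \mathbf{0}$. Take $\mathbf{v} = h\mathbf{u}$ in Taylor's formula (5), with $h \ne 0$, to get

$$\mathbf{f}(\mathbf{c} + h\mathbf{u}) - \mathbf{f}(\mathbf{c}) = \mathbf{T}_\mathbf{c}(h\mathbf{u}) + \|h\mathbf{u}\|\,\mathbf{E}_\mathbf{c}(\mathbf{v}) = h\mathbf{T}_\mathbf{c}(\mathbf{u}) + |h|\,\|\mathbf{u}\|\,\mathbf{E}_\mathbf{c}(\mathbf{v}).$$

Now divide by $h$ and let $h \to 0$ to obtain (6).

Theorem 12.4. If $\mathbf{f}$ is differentiable at $\mathbf{c}$, then $\mathbf{f}$ is continuous at $\mathbf{c}$.

Proof. Let $\mathbf{v} \to \mathbf{0}$ in the Taylor formula (5). The error term $\|\mathbf{v}\|\,\mathbf{E}_\mathbf{c}(\mathbf{v}) \to \mathbf{0}$; the
linear term $\mathbf{T}_\mathbf{c}(\mathbf{v})$ also tends to $\mathbf{0}$ because if $\mathbf{v} = v_1\mathbf{u}_1 + \cdots + v_n\mathbf{u}_n$, where
$\mathbf{u}_1, \ldots, \mathbf{u}_n$ are the unit coordinate vectors, then by linearity we have

$$\mathbf{T}_\mathbf{c}(\mathbf{v}) = v_1\mathbf{T}_\mathbf{c}(\mathbf{u}_1) + \cdots + v_n\mathbf{T}_\mathbf{c}(\mathbf{u}_n),$$

and each term on the right tends to $\mathbf{0}$ as $\mathbf{v} \to \mathbf{0}$.

NOTE. The total derivative $\mathbf{T}_\mathbf{c}$ is also written as $\mathbf{f}'(\mathbf{c})$ to resemble the notation used
in the one-dimensional theory. With this notation, the Taylor formula (5) takes
the form

$$\mathbf{f}(\mathbf{c} + \mathbf{v}) = \mathbf{f}(\mathbf{c}) + \mathbf{f}'(\mathbf{c})(\mathbf{v}) + \|\mathbf{v}\|\,\mathbf{E}_\mathbf{c}(\mathbf{v}), \tag{7}$$

where $\mathbf{E}_\mathbf{c}(\mathbf{v}) \to \mathbf{0}$ as $\mathbf{v} \to \mathbf{0}$. However, it should be realized that $\mathbf{f}'(\mathbf{c})$ is a linear
function, not a number. It is defined everywhere on $\mathbf{R}^n$; the vector $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ is the
value of $\mathbf{f}'(\mathbf{c})$ at $\mathbf{v}$.

Example. If $\mathbf{f}$ is itself a linear function, then $\mathbf{f}(\mathbf{c} + \mathbf{v}) = \mathbf{f}(\mathbf{c}) + \mathbf{f}(\mathbf{v})$, so the derivative
$\mathbf{f}'(\mathbf{c})$ exists for every $\mathbf{c}$ and equals $\mathbf{f}$. In other words, the total derivative of a linear function
is the function itself.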

12.5 THE TOTAL DERIVATIVE EXPRESSED IN TERMS OF PARTIAL 
DERIVATIVES 

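The next theorem shows that the vector $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ is a linear combination of the partial
derivatives of $\mathbf{f}$.

Theorem 12.5. Let $\mathbf{f} : S \to \mathbf{R}^m$ be differentiable at an interior point $\mathbf{c}$ of $S$, where
$S \subseteq \mathbf{R}^n$. If $\mathbf{v} = v_1\mathbf{u}_1 + \cdots + v_n\mathbf{u}_n$, where $\mathbf{u}_1, \ldots, \mathbf{u}_n$ are the unit coordinate
vectors in $\mathbf{R}^n$, then

$$\mathbf{f}'(\mathbf{c})(\mathbf{v}) = \sum_{k=1}^{n} v_k D_k \mathbf{f}(\mathbf{c}).$$

In particular, if $f$ is real-valued ($m = 1$) we have

$$f'(\mathbf{c})(\mathbf{v}) = \nabla f(\mathbf{c}) \cdot \mathbf{v}, \tag{8}$$

the dot product of $\mathbf{v}$ with the vector $\nabla f(\mathbf{c}) = (D_1 f(\mathbf{c}), \ldots, D_n f(\mathbf{c}))$.

Proof. We use the linearity of $\mathbf{f}'(\mathbf{c})$ to write

$$\mathbf{f}'(\mathbf{c})(\mathbf{v}) = \sum_{k=1}^{n} \mathbf{f}'(\mathbf{c})(v_k\mathbf{u}_k) = \sum_{k=1}^{n} v_k \mathbf{f}'(\mathbf{c})(\mathbf{u}_k) = \sum_{k=1}^{n} v_k \mathbf{f}'(\mathbf{c}; \mathbf{u}_k) = \sum_{k=1}^{n} v_k D_k \mathbf{f}(\mathbf{c}).$$

NOTE. The vector $\nabla f(\mathbf{c})$ in (8) is called the gradient vector of $f$ at $\mathbf{c}$. It is defined
at each point where the partials $D_1 f, \ldots, D_n f$ exist. The Taylor formula for
real-valued $f$ now takes the form

$$f(\mathbf{c} + \mathbf{v}) = f(\mathbf{c}) + \nabla f(\mathbf{c}) \cdot \mathbf{v} + o(\|\mathbf{v}\|) \quad \text{as } \mathbf{v} \to \mathbf{0}.$$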

12.6 AN APPLICATION TO COMPLEX-VALUED FUNCTIONS 

Let $f = u + iv$ be a complex-valued function of a complex variable. Theorem
5.22 showed that a necessary condition for $f$ to have a derivative at a point $c$ is that
the four partials $D_1 u$, $D_2 u$, $D_1 v$, $D_2 v$ exist at $c$ and satisfy the Cauchy-Riemann
equations:

$$D_1 u(c) = D_2 v(c), \qquad D_1 v(c) = -D_2 u(c).$$

Also, an example showed that the equations by themselves are not sufficient for
existence of $f'(c)$. The next theorem shows that the Cauchy-Riemann equations,
along with differentiability of $u$ and $v$, imply existence of $f'(c)$.

Theorem 12.6. Let $u$ and $v$ be two real-valued functions defined on a subset $S$ of the
complex plane. Assume also that $u$ and $v$ are differentiable at an interior point $c$
of $S$ and that the partial derivatives satisfy the Cauchy-Riemann equations at $c$.
Then the function $f = u + iv$ has a derivative at $c$. Moreover,

$$f'(c) = D_1 u(c) + iD_1 v(c).$$

Proof. We have $f(z) - f(c) = u(z) - u(c) + i\{v(z) - v(c)\}$ for each $z$ in $S$.
Since each of $u$ and $v$ is differentiable at $c$, for $z$ sufficiently near to $c$ we have

$$u(z) - u(c) = \nabla u(c) \cdot (z - c) + o(\|z - c\|)$$

and

$$v(z) - v(c) = \nabla v(c) \cdot (z - c) + o(\|z - c\|).$$

Here we use vector notation and consider complex numbers as vectors in $\mathbf{R}^2$. We
then have

$$f(z) - f(c) = \{\nabla u(c) + i\,\nabla v(c)\} \cdot (z - c) + o(\|z - c\|).$$

Writing $z = x + iy$ and $c = a + ib$, we find

$$\{\nabla u(c) + i\,\nabla v(c)\} \cdot (z - c) = D_1 u(c)(x - a) + D_2 u(c)(y - b) + i\{D_1 v(c)(x - a) + D_2 v(c)(y - b)\}$$
$$= D_1 u(c)\{(x - a) + i(y - b)\} + iD_1 v(c)\{(x - a) + i(y - b)\},$$

because of the Cauchy-Riemann equations. Hence

$$f(z) - f(c) = \{D_1 u(c) + iD_1 v(c)\}(z - c) + o(\|z - c\|).$$

Dividing by $z - c$ and letting $z \to c$ we see that $f'(c)$ exists and is equal to

$$D_1 u(c) + iD_1 v(c).$$
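
As an illustration of Theorem 12.6, the sketch below takes the sample function $f(z) = z^3$ (an assumption chosen for the example, with $u = x^3 - 3xy^2$ and $v = 3x^2y - y^3$), estimates the four partials by central differences, and compares $D_1 u(c) + iD_1 v(c)$ with the known derivative $3c^2$. The helper `partial` and the step size are likewise assumptions.

```python
def u(x, y):
    return x ** 3 - 3 * x * y ** 2      # real part of (x + iy)^3

def v(x, y):
    return 3 * x ** 2 * y - y ** 3      # imaginary part of (x + iy)^3

def partial(g, x, y, k, h=1e-6):
    """Central-difference estimate of D_k g at (x, y), for k = 1 or 2."""
    if k == 1:
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

a, b = 0.7, -0.4
c = complex(a, b)

d1u, d2u = partial(u, a, b, 1), partial(u, a, b, 2)
d1v, d2v = partial(v, a, b, 1), partial(v, a, b, 2)

# Residuals of the Cauchy-Riemann equations; both should be close to 0.
cauchy_riemann_residuals = (d1u - d2v, d1v + d2u)

derivative = complex(d1u, d1v)          # the formula D_1 u(c) + i D_1 v(c)
print(cauchy_riemann_residuals, derivative, 3 * c * c)
```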

12.7 THE MATRIX OF A LINEAR FUNCTION 

In this section we digress briefly to record some elementary facts from linear 
algebra that are useful in certain calculations with derivatives. 

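Let $\mathbf{T} : \mathbf{R}^n \to \mathbf{R}^m$ be a linear function. (In our applications, $\mathbf{T}$ will be the
total derivative of a function $\mathbf{f}$.) We will show that $\mathbf{T}$ determines an $m \times n$ matrix
of scalars (see (9) below) which is obtained as follows:

Let $\mathbf{u}_1, \ldots, \mathbf{u}_n$ denote the unit coordinate vectors in $\mathbf{R}^n$. If $\mathbf{x} \in \mathbf{R}^n$ we have
$\mathbf{x} = x_1\mathbf{u}_1 + \cdots + x_n\mathbf{u}_n$ so, by linearity,

$$\mathbf{T}(\mathbf{x}) = \sum_{k=1}^{n} x_k \mathbf{T}(\mathbf{u}_k).$$

Therefore $\mathbf{T}$ is completely determined by its action on the coordinate vectors
$\mathbf{u}_1, \ldots, \mathbf{u}_n$.

Now let $\mathbf{e}_1, \ldots, \mathbf{e}_m$ denote the unit coordinate vectors in $\mathbf{R}^m$. Since $\mathbf{T}(\mathbf{u}_k) \in \mathbf{R}^m$,
we can write $\mathbf{T}(\mathbf{u}_k)$ as a linear combination of $\mathbf{e}_1, \ldots, \mathbf{e}_m$, say

$$\mathbf{T}(\mathbf{u}_k) = \sum_{i=1}^{m} t_{ik}\mathbf{e}_i.$$

The scalars $t_{1k}, \ldots, t_{mk}$ are the coordinates of $\mathbf{T}(\mathbf{u}_k)$. We display these scalars
vertically as follows:

$$\begin{bmatrix} t_{1k} \\ t_{2k} \\ \vdots \\ t_{mk} \end{bmatrix}.$$

This array is called a column vector. We form the column vector for each of
$\mathbf{T}(\mathbf{u}_1), \ldots, \mathbf{T}(\mathbf{u}_n)$ and place them side by side to obtain the rectangular array

$$\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{bmatrix}. \tag{9}$$

This is called the matrix* of $\mathbf{T}$ and is denoted by $m(\mathbf{T})$. It consists of $m$ rows and
$n$ columns. The numbers going down the $k$th column are the components of
$\mathbf{T}(\mathbf{u}_k)$. We also use the notation

$$m(\mathbf{T}) = [t_{ik}]_{i,k=1}^{m,n} \quad \text{or} \quad m(\mathbf{T}) = (t_{ik})$$

to denote the matrix in (9).

* More precisely, the matrix of $\mathbf{T}$ relative to the given bases $\mathbf{u}_1, \ldots, \mathbf{u}_n$ of $\mathbf{R}^n$ and
$\mathbf{e}_1, \ldots, \mathbf{e}_m$ of $\mathbf{R}^m$.

Now let $\mathbf{T} : \mathbf{R}^n \to \mathbf{R}^m$ and $\mathbf{S} : \mathbf{R}^m \to \mathbf{R}^p$ be two linear functions, with the domain
of $\mathbf{S}$ containing the range of $\mathbf{T}$. Then we can form the composition $\mathbf{S} \circ \mathbf{T}$ defined by

$$(\mathbf{S} \circ \mathbf{T})(\mathbf{x}) = \mathbf{S}[\mathbf{T}(\mathbf{x})] \quad \text{for all } \mathbf{x} \text{ in } \mathbf{R}^n.$$

The composition $\mathbf{S} \circ \mathbf{T}$ is also linear and it maps $\mathbf{R}^n$ into $\mathbf{R}^p$.

Let us calculate the matrix $m(\mathbf{S} \circ \mathbf{T})$. Denote the unit coordinate vectors in
$\mathbf{R}^n$, $\mathbf{R}^m$, and $\mathbf{R}^p$, respectively, by

$$\mathbf{u}_1, \ldots, \mathbf{u}_n, \qquad \mathbf{e}_1, \ldots, \mathbf{e}_m, \qquad \text{and} \qquad \mathbf{w}_1, \ldots, \mathbf{w}_p.$$

Suppose that $\mathbf{S}$ and $\mathbf{T}$ have matrices $(s_{ij})$ and $(t_{ij})$, respectively. This means that

$$\mathbf{S}(\mathbf{e}_k) = \sum_{i=1}^{p} s_{ik}\mathbf{w}_i \quad \text{for } k = 1, 2, \ldots, m$$

and

$$\mathbf{T}(\mathbf{u}_j) = \sum_{k=1}^{m} t_{kj}\mathbf{e}_k \quad \text{for } j = 1, 2, \ldots, n.$$

Then

$$(\mathbf{S} \circ \mathbf{T})(\mathbf{u}_j) = \mathbf{S}[\mathbf{T}(\mathbf{u}_j)] = \sum_{k=1}^{m} t_{kj}\mathbf{S}(\mathbf{e}_k) = \sum_{k=1}^{m} t_{kj} \sum_{i=1}^{p} s_{ik}\mathbf{w}_i = \sum_{i=1}^{p} \left( \sum_{k=1}^{m} s_{ik}t_{kj} \right) \mathbf{w}_i,$$

so

$$m(\mathbf{S} \circ \mathbf{T}) = \left[ \sum_{k=1}^{m} s_{ik}t_{kj} \right]_{i,j=1}^{p,n}.$$

In other words, $m(\mathbf{S} \circ \mathbf{T})$ is a $p \times n$ matrix whose entry in the $i$th row and $j$th
column is

$$\sum_{k=1}^{m} s_{ik}t_{kj},$$

the dot product of the $i$th row of $m(\mathbf{S})$ with the $j$th column of $m(\mathbf{T})$. This matrix
is also called the product $m(\mathbf{S})m(\mathbf{T})$. Thus, $m(\mathbf{S} \circ \mathbf{T}) = m(\mathbf{S})m(\mathbf{T})$.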
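
The construction of $m(\mathbf{T})$ column by column, and the identity $m(\mathbf{S} \circ \mathbf{T}) = m(\mathbf{S})m(\mathbf{T})$, can be sketched in a few lines. The two linear maps below are arbitrary examples, and the helper names `matrix_of` and `mat_mul` are assumptions introduced for this illustration.

```python
def matrix_of(T, n):
    """Return m(T) as a list of rows; column k holds the components of T(u_k)."""
    columns = [T([1.0 if i == k else 0.0 for i in range(n)]) for k in range(n)]
    m = len(columns[0])
    return [[columns[k][i] for k in range(n)] for i in range(m)]

def mat_mul(A, B):
    """Entry (i, j) is the dot product of row i of A with column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def T(x):            # a linear map from R^3 to R^2
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]

def S(y):            # a linear map from R^2 to R^3
    return [y[0] - y[1], 2 * y[0], y[0] + 4 * y[1]]

def S_after_T(x):    # the composition, from R^3 to R^3
    return S(T(x))

m_T = matrix_of(T, 3)            # 2 x 3
m_S = matrix_of(S, 2)            # 3 x 2
m_ST = matrix_of(S_after_T, 3)   # 3 x 3

print(m_ST)
print(mat_mul(m_S, m_T))
```

The entries here are small integers stored as floats, so the two printed matrices agree exactly; with general real entries a tolerance would be needed.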


12.8 THE JACOBIAN MATRIX 


Next we show how matrices arise in connection with total derivatives. 

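Let $\mathbf{f}$ be a function with values in $\mathbf{R}^m$ which is differentiable at a point $\mathbf{c}$ in $\mathbf{R}^n$,
and let $\mathbf{T} = \mathbf{f}'(\mathbf{c})$ be the total derivative of $\mathbf{f}$ at $\mathbf{c}$. To find the matrix of $\mathbf{T}$ we
consider its action on the unit coordinate vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$. By Theorem 12.3
we have

$$\mathbf{T}(\mathbf{u}_k) = \mathbf{f}'(\mathbf{c}; \mathbf{u}_k) = D_k \mathbf{f}(\mathbf{c}).$$

To express this as a linear combination of the unit coordinate vectors $\mathbf{e}_1, \ldots, \mathbf{e}_m$ of
$\mathbf{R}^m$ we write $\mathbf{f} = (f_1, \ldots, f_m)$ so that $D_k \mathbf{f} = (D_k f_1, \ldots, D_k f_m)$, and hence

$$\mathbf{T}(\mathbf{u}_k) = D_k \mathbf{f}(\mathbf{c}) = \sum_{i=1}^{m} D_k f_i(\mathbf{c})\,\mathbf{e}_i.$$

Therefore the matrix of $\mathbf{T}$ is $m(\mathbf{T}) = (D_k f_i(\mathbf{c}))$. This is called the Jacobian matrix
of $\mathbf{f}$ at $\mathbf{c}$ and is denoted by $D\mathbf{f}(\mathbf{c})$. That is,

$$D\mathbf{f}(\mathbf{c}) = \begin{bmatrix} D_1 f_1(\mathbf{c}) & D_2 f_1(\mathbf{c}) & \cdots & D_n f_1(\mathbf{c}) \\ D_1 f_2(\mathbf{c}) & D_2 f_2(\mathbf{c}) & \cdots & D_n f_2(\mathbf{c}) \\ \vdots & \vdots & & \vdots \\ D_1 f_m(\mathbf{c}) & D_2 f_m(\mathbf{c}) & \cdots & D_n f_m(\mathbf{c}) \end{bmatrix}. \tag{10}$$

The entry in the $i$th row and $k$th column is $D_k f_i(\mathbf{c})$. Thus, to get the entries in the
$k$th column, differentiate the components of $\mathbf{f}$ with respect to the $k$th coordinate
vector. The Jacobian matrix $D\mathbf{f}(\mathbf{c})$ is defined at each point $\mathbf{c}$ in $\mathbf{R}^n$ where all the
partial derivatives $D_k f_i(\mathbf{c})$ exist.

The $k$th row of the Jacobian matrix (10) is a vector in $\mathbf{R}^n$ called the gradient
vector of $f_k$, denoted by $\nabla f_k(\mathbf{c})$. That is,

$$\nabla f_k(\mathbf{c}) = (D_1 f_k(\mathbf{c}), \ldots, D_n f_k(\mathbf{c})).$$

In the special case when $f$ is real-valued ($m = 1$), the Jacobian matrix consists
of only one row. In this case $Df(\mathbf{c}) = \nabla f(\mathbf{c})$, and Equation (8) of Theorem 12.5
shows that the directional derivative $f'(\mathbf{c}; \mathbf{v})$ is the dot product of the gradient
vector $\nabla f(\mathbf{c})$ with the direction $\mathbf{v}$.

For a vector-valued function $\mathbf{f} = (f_1, \ldots, f_m)$ we have

$$\mathbf{f}'(\mathbf{c})(\mathbf{v}) = \mathbf{f}'(\mathbf{c}; \mathbf{v}) = \sum_{k=1}^{m} f_k'(\mathbf{c}; \mathbf{v})\,\mathbf{e}_k = \sum_{k=1}^{m} \{\nabla f_k(\mathbf{c}) \cdot \mathbf{v}\}\,\mathbf{e}_k, \tag{11}$$

so the vector $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ has components

$$(\nabla f_1(\mathbf{c}) \cdot \mathbf{v}, \ldots, \nabla f_m(\mathbf{c}) \cdot \mathbf{v}).$$

Thus, the components of $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ are obtained by taking the dot product of the
successive rows of the Jacobian matrix with the vector $\mathbf{v}$. If we regard $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ as
an $m \times 1$ matrix, or column vector, then $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ is equal to the matrix product
$D\mathbf{f}(\mathbf{c})\mathbf{v}$, where $D\mathbf{f}(\mathbf{c})$ is the $m \times n$ Jacobian matrix and $\mathbf{v}$ is regarded as an $n \times 1$
matrix, or column vector.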
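
The following sketch approximates a Jacobian matrix column by column with central differences and then checks that the product $D\mathbf{f}(\mathbf{c})\mathbf{v}$ matches a directly computed directional derivative. The sample map, the helper name `jacobian`, and the step sizes are assumptions made for the illustration; finite differences only approximate the partial derivatives.

```python
import math

def f(x):
    """A sample differentiable map from R^2 to R^3."""
    return [x[0] * x[1], math.sin(x[0]), x[0] + x[1] ** 2]

def jacobian(func, c, h=1e-6):
    """Approximate Df(c): entry (i, k) estimates D_k f_i(c) by a central difference."""
    n, m = len(c), len(func(c))
    J = [[0.0] * n for _ in range(m)]
    for k in range(n):
        plus, minus = list(c), list(c)
        plus[k] += h
        minus[k] -= h
        fp, fm = func(plus), func(minus)
        for i in range(m):
            J[i][k] = (fp[i] - fm[i]) / (2 * h)
    return J

c = [1.0, 2.0]
v = [0.5, -1.0]

J = jacobian(f, c)

# Matrix product Df(c) v: dot product of each row (a gradient) with v.
Jv = [sum(J[i][k] * v[k] for k in range(len(v))) for i in range(len(J))]

# Directional derivative f'(c; v) estimated straight from the definition.
t = 1e-6
ahead = f([ci + t * vi for ci, vi in zip(c, v)])
behind = f([ci - t * vi for ci, vi in zip(c, v)])
directional = [(p - q) / (2 * t) for p, q in zip(ahead, behind)]

print(J)
print(Jv, directional)
```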


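NOTE. Equation (11), used in conjunction with the triangle inequality and the
Cauchy-Schwarz inequality, gives us

$$\|\mathbf{f}'(\mathbf{c})(\mathbf{v})\| = \left\| \sum_{k=1}^{m} \{\nabla f_k(\mathbf{c}) \cdot \mathbf{v}\}\,\mathbf{e}_k \right\| \le \sum_{k=1}^{m} |\nabla f_k(\mathbf{c}) \cdot \mathbf{v}| \le \|\mathbf{v}\| \sum_{k=1}^{m} \|\nabla f_k(\mathbf{c})\|.$$

Therefore we have

$$\|\mathbf{f}'(\mathbf{c})(\mathbf{v})\| \le M\|\mathbf{v}\|, \tag{12}$$

where $M = \sum_{k=1}^{m} \|\nabla f_k(\mathbf{c})\|$. This inequality will be used in the proof of the chain
rule. It also shows that $\mathbf{f}'(\mathbf{c})(\mathbf{v}) \to \mathbf{0}$ as $\mathbf{v} \to \mathbf{0}$.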


12.9 THE CHAIN RULE 

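Let $\mathbf{f}$ and $\mathbf{g}$ be functions such that the composition $\mathbf{h} = \mathbf{f} \circ \mathbf{g}$ is defined in a
neighborhood of a point $\mathbf{a}$. The chain rule tells us how to compute the total
derivative of $\mathbf{h}$ in terms of total derivatives of $\mathbf{f}$ and of $\mathbf{g}$.

Theorem 12.7. Assume that $\mathbf{g}$ is differentiable at $\mathbf{a}$, with total derivative $\mathbf{g}'(\mathbf{a})$. Let
$\mathbf{b} = \mathbf{g}(\mathbf{a})$ and assume that $\mathbf{f}$ is differentiable at $\mathbf{b}$, with total derivative $\mathbf{f}'(\mathbf{b})$. Then
the composite function $\mathbf{h} = \mathbf{f} \circ \mathbf{g}$ is differentiable at $\mathbf{a}$, and the total derivative $\mathbf{h}'(\mathbf{a})$
is given by

$$\mathbf{h}'(\mathbf{a}) = \mathbf{f}'(\mathbf{b}) \circ \mathbf{g}'(\mathbf{a}),$$

the composition of the linear functions $\mathbf{f}'(\mathbf{b})$ and $\mathbf{g}'(\mathbf{a})$.

Proof. We consider the difference $\mathbf{h}(\mathbf{a} + \mathbf{y}) - \mathbf{h}(\mathbf{a})$ for small $\|\mathbf{y}\|$, and show that
we have a first-order Taylor formula. We have

$$\mathbf{h}(\mathbf{a} + \mathbf{y}) - \mathbf{h}(\mathbf{a}) = \mathbf{f}[\mathbf{g}(\mathbf{a} + \mathbf{y})] - \mathbf{f}[\mathbf{g}(\mathbf{a})] = \mathbf{f}(\mathbf{b} + \mathbf{v}) - \mathbf{f}(\mathbf{b}), \tag{13}$$

where $\mathbf{b} = \mathbf{g}(\mathbf{a})$ and $\mathbf{v} = \mathbf{g}(\mathbf{a} + \mathbf{y}) - \mathbf{b}$. The Taylor formula for $\mathbf{g}(\mathbf{a} + \mathbf{y})$ implies

$$\mathbf{v} = \mathbf{g}'(\mathbf{a})(\mathbf{y}) + \|\mathbf{y}\|\,\mathbf{E}_\mathbf{a}(\mathbf{y}), \quad \text{where } \mathbf{E}_\mathbf{a}(\mathbf{y}) \to \mathbf{0} \text{ as } \mathbf{y} \to \mathbf{0}. \tag{14}$$

The Taylor formula for $\mathbf{f}(\mathbf{b} + \mathbf{v})$ implies

$$\mathbf{f}(\mathbf{b} + \mathbf{v}) - \mathbf{f}(\mathbf{b}) = \mathbf{f}'(\mathbf{b})(\mathbf{v}) + \|\mathbf{v}\|\,\mathbf{E}_\mathbf{b}(\mathbf{v}), \quad \text{where } \mathbf{E}_\mathbf{b}(\mathbf{v}) \to \mathbf{0} \text{ as } \mathbf{v} \to \mathbf{0}. \tag{15}$$

Using (14) in (15) we find

$$\mathbf{f}(\mathbf{b} + \mathbf{v}) - \mathbf{f}(\mathbf{b}) = \mathbf{f}'(\mathbf{b})[\mathbf{g}'(\mathbf{a})(\mathbf{y})] + \mathbf{f}'(\mathbf{b})[\|\mathbf{y}\|\,\mathbf{E}_\mathbf{a}(\mathbf{y})] + \|\mathbf{v}\|\,\mathbf{E}_\mathbf{b}(\mathbf{v})$$
$$= \mathbf{f}'(\mathbf{b})[\mathbf{g}'(\mathbf{a})(\mathbf{y})] + \|\mathbf{y}\|\,\mathbf{E}(\mathbf{y}), \tag{16}$$

where $\mathbf{E}(\mathbf{0}) = \mathbf{0}$ and

$$\mathbf{E}(\mathbf{y}) = \mathbf{f}'(\mathbf{b})[\mathbf{E}_\mathbf{a}(\mathbf{y})] + \frac{\|\mathbf{v}\|}{\|\mathbf{y}\|}\,\mathbf{E}_\mathbf{b}(\mathbf{v}) \quad \text{if } \mathbf{y} \ne \mathbf{0}. \tag{17}$$

To complete the proof we need to show that $\mathbf{E}(\mathbf{y}) \to \mathbf{0}$ as $\mathbf{y} \to \mathbf{0}$.

The first term on the right of (17) tends to $\mathbf{0}$ as $\mathbf{y} \to \mathbf{0}$ because $\mathbf{E}_\mathbf{a}(\mathbf{y}) \to \mathbf{0}$. In the
second term, the factor $\mathbf{E}_\mathbf{b}(\mathbf{v}) \to \mathbf{0}$ because $\mathbf{v} \to \mathbf{0}$ as $\mathbf{y} \to \mathbf{0}$. Now we show that
the quotient $\|\mathbf{v}\|/\|\mathbf{y}\|$ remains bounded as $\mathbf{y} \to \mathbf{0}$. Using (14) and (12) to estimate
the numerator we find

$$\|\mathbf{v}\| \le \|\mathbf{g}'(\mathbf{a})(\mathbf{y})\| + \|\mathbf{y}\|\,\|\mathbf{E}_\mathbf{a}(\mathbf{y})\| \le \|\mathbf{y}\|\{M + \|\mathbf{E}_\mathbf{a}(\mathbf{y})\|\},$$

where $M = \sum_{k=1}^{n} \|\nabla g_k(\mathbf{a})\|$. Hence

$$\frac{\|\mathbf{v}\|}{\|\mathbf{y}\|} \le M + \|\mathbf{E}_\mathbf{a}(\mathbf{y})\|,$$

so $\|\mathbf{v}\|/\|\mathbf{y}\|$ remains bounded as $\mathbf{y} \to \mathbf{0}$. Using (13) and (16) we obtain the Taylor
formula

$$\mathbf{h}(\mathbf{a} + \mathbf{y}) - \mathbf{h}(\mathbf{a}) = \mathbf{f}'(\mathbf{b})[\mathbf{g}'(\mathbf{a})(\mathbf{y})] + \|\mathbf{y}\|\,\mathbf{E}(\mathbf{y}),$$

where $\mathbf{E}(\mathbf{y}) \to \mathbf{0}$ as $\mathbf{y} \to \mathbf{0}$. This proves that $\mathbf{h}$ is differentiable at $\mathbf{a}$ and that its
total derivative at $\mathbf{a}$ is the composition $\mathbf{f}'(\mathbf{b}) \circ \mathbf{g}'(\mathbf{a})$.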


12.10 MATRIX FORM OF THE CHAIN RULE 
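The chain rule states that

$$\mathbf{h}'(\mathbf{a}) = \mathbf{f}'(\mathbf{b}) \circ \mathbf{g}'(\mathbf{a}), \tag{18}$$

where $\mathbf{h} = \mathbf{f} \circ \mathbf{g}$ and $\mathbf{b} = \mathbf{g}(\mathbf{a})$. Since the matrix of a composition is the product
of the corresponding matrices, (18) implies the following relation for Jacobian
matrices:

$$D\mathbf{h}(\mathbf{a}) = D\mathbf{f}(\mathbf{b})\,D\mathbf{g}(\mathbf{a}). \tag{19}$$

This is called the matrix form of the chain rule. It can also be written as a set of
scalar equations by expressing each matrix in terms of its entries.

Specifically, suppose that $\mathbf{a} \in \mathbf{R}^p$, $\mathbf{b} = \mathbf{g}(\mathbf{a}) \in \mathbf{R}^n$, and $\mathbf{f}(\mathbf{b}) \in \mathbf{R}^m$. Then $\mathbf{h}(\mathbf{a}) \in \mathbf{R}^m$
and we can write

$$\mathbf{g} = (g_1, \ldots, g_n), \qquad \mathbf{f} = (f_1, \ldots, f_m), \qquad \mathbf{h} = (h_1, \ldots, h_m).$$

Then $D\mathbf{h}(\mathbf{a})$ is an $m \times p$ matrix, $D\mathbf{f}(\mathbf{b})$ is an $m \times n$ matrix, and $D\mathbf{g}(\mathbf{a})$ is an $n \times p$
matrix, given by

$$D\mathbf{h}(\mathbf{a}) = [D_j h_i(\mathbf{a})]_{i,j=1}^{m,p}, \qquad D\mathbf{f}(\mathbf{b}) = [D_k f_i(\mathbf{b})]_{i,k=1}^{m,n}, \qquad D\mathbf{g}(\mathbf{a}) = [D_j g_k(\mathbf{a})]_{k,j=1}^{n,p}.$$

The matrix equation (19) is equivalent to the $mp$ scalar equations

$$D_j h_i(\mathbf{a}) = \sum_{k=1}^{n} D_k f_i(\mathbf{b})\,D_j g_k(\mathbf{a}), \quad \text{for } i = 1, 2, \ldots, m \text{ and } j = 1, 2, \ldots, p. \tag{20}$$

These equations express the partial derivatives of the components of $\mathbf{h}$ in terms of
the partial derivatives of the components of $\mathbf{f}$ and $\mathbf{g}$.

The equations in (20) can be put in a form that is easier to remember. Write
$\mathbf{y} = \mathbf{f}(\mathbf{x})$ and $\mathbf{x} = \mathbf{g}(\mathbf{t})$. Then $\mathbf{y} = \mathbf{f}[\mathbf{g}(\mathbf{t})] = \mathbf{h}(\mathbf{t})$, and (20) becomes

$$\frac{\partial y_i}{\partial t_j} = \sum_{k=1}^{n} \frac{\partial y_i}{\partial x_k} \frac{\partial x_k}{\partial t_j}, \tag{21}$$

where

$$\frac{\partial y_i}{\partial t_j} = D_j h_i, \qquad \frac{\partial y_i}{\partial x_k} = D_k f_i, \qquad \text{and} \qquad \frac{\partial x_k}{\partial t_j} = D_j g_k.$$

Examples. Suppose $m = 1$. Then both $f$ and $h = f \circ \mathbf{g}$ are real-valued and there are $p$
equations in (20), one for each of the partial derivatives of $h$:

$$D_j h(\mathbf{a}) = \sum_{k=1}^{n} D_k f(\mathbf{b})\,D_j g_k(\mathbf{a}), \quad j = 1, 2, \ldots, p.$$

The right member is the dot product of the two vectors $\nabla f(\mathbf{b})$ and $D_j \mathbf{g}(\mathbf{a})$. In this case
Equation (21) takes the form

$$\frac{\partial y}{\partial t_j} = \sum_{k=1}^{n} \frac{\partial y}{\partial x_k} \frac{\partial x_k}{\partial t_j}, \quad j = 1, 2, \ldots, p.$$

In particular, if $p = 1$ we get only one equation,

$$h'(a) = \sum_{k=1}^{n} D_k f(\mathbf{b})\,g_k'(a) = \nabla f(\mathbf{b}) \cdot D\mathbf{g}(a),$$

where the Jacobian matrix $D\mathbf{g}(a)$ is a column vector.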
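
A small numerical sketch of the matrix form (19): the Jacobians of two sample polynomial maps are written out by hand, multiplied, and compared with a finite-difference Jacobian of the composite. The maps, the point $\mathbf{a}$, and the helper names `mat_mul` and `numeric_jacobian` are all assumptions chosen for illustration.

```python
def g(t):            # sample map from R^2 to R^3
    return [t[0] * t[1], t[0] + t[1], t[0] ** 2]

def f(x):            # sample map from R^3 to R^2
    return [x[0] * x[1] + x[2], x[0] - x[2] ** 2]

def h(t):            # the composite h = f o g, from R^2 to R^2
    return f(g(t))

def Dg(t):           # 3 x 2 Jacobian of g, computed by hand
    return [[t[1], t[0]], [1.0, 1.0], [2 * t[0], 0.0]]

def Df(x):           # 2 x 3 Jacobian of f, computed by hand
    return [[x[1], x[0], 1.0], [1.0, 0.0, -2 * x[2]]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, c, step=1e-6):
    """Central-difference estimate of the Jacobian of func at c."""
    n, m = len(c), len(func(c))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(c), list(c)
        plus[j] += step
        minus[j] -= step
        fp, fm = func(plus), func(minus)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * step)
    return J

a = [1.0, 2.0]
b = g(a)

product = mat_mul(Df(b), Dg(a))    # right-hand side of (19)
numeric = numeric_jacobian(h, a)   # estimate of the left-hand side Dh(a)

print(product)
print(numeric)
```

Note the order of the factors: $D\mathbf{f}$ is evaluated at $\mathbf{b} = \mathbf{g}(\mathbf{a})$, not at $\mathbf{a}$, and it stands on the left.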

The chain rule can be used to give a simple proof of the following theorem for 
differentiating an integral with respect to a parameter which appears both in the 
integrand and in the limits of integration. 

Theorem 12.8. Let $f$ and $D_2 f$ be continuous on a rectangle $[a, b] \times [c, d]$. Let $p$
and $q$ be differentiable on $[c, d]$, where $p(y) \in [a, b]$ and $q(y) \in [a, b]$ for each $y$ in
$[c, d]$. Define $F$ by the equation

$$F(y) = \int_{p(y)}^{q(y)} f(x, y)\,dx, \quad \text{if } y \in [c, d].$$

Then $F'(y)$ exists for each $y$ in $(c, d)$ and is given by

$$F'(y) = \int_{p(y)}^{q(y)} D_2 f(x, y)\,dx + f(q(y), y)\,q'(y) - f(p(y), y)\,p'(y).$$

Proof. Let $G(x_1, x_2, x_3) = \int_{x_1}^{x_2} f(t, x_3)\,dt$ whenever $x_1$ and $x_2$ are in $[a, b]$ and
$x_3 \in [c, d]$. Then $F$ is the composite function given by $F(y) = G(p(y), q(y), y)$.
The chain rule implies

$$F'(y) = D_1 G(p(y), q(y), y)\,p'(y) + D_2 G(p(y), q(y), y)\,q'(y) + D_3 G(p(y), q(y), y).$$

By Theorem 7.32, we have $D_1 G(x_1, x_2, x_3) = -f(x_1, x_3)$ and $D_2 G(x_1, x_2, x_3) =
f(x_2, x_3)$. By Theorem 7.40, we also have

$$D_3 G(x_1, x_2, x_3) = \int_{x_1}^{x_2} D_2 f(t, x_3)\,dt.$$

Using these results in the formula for $F'(y)$ we obtain the theorem.
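
The formula of Theorem 12.8 can be tested on a concrete case. In the sketch below, the integrand, the limit functions $p$ and $q$, the Simpson quadrature helper, and the evaluation point are all assumptions chosen for illustration; the formula's value is compared with a central-difference estimate of $F'(y)$.

```python
import math

def simpson(g, lo, hi, steps=1000):
    """Composite Simpson rule for the integral of g over [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def f(x, y):
    return math.sin(x * y) + x

def d2f(x, y):                  # partial derivative of f with respect to y
    return x * math.cos(x * y)

def p(y): return y * y          # lower limit, with p'(y) = 2y
def dp(y): return 2 * y
def q(y): return 1 + y          # upper limit, with q'(y) = 1
def dq(y): return 1.0

def F(y):
    return simpson(lambda x: f(x, y), p(y), q(y))

def F_prime_formula(y):
    """Right-hand side of the formula in Theorem 12.8."""
    integral = simpson(lambda x: d2f(x, y), p(y), q(y))
    return integral + f(q(y), y) * dq(y) - f(p(y), y) * dp(y)

y0 = 0.6
step = 1e-5
numeric = (F(y0 + step) - F(y0 - step)) / (2 * step)
formula = F_prime_formula(y0)
print(numeric, formula)
```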


12.11 THE MEAN-VALUE THEOREM FOR DIFFERENTIABLE FUNCTIONS 

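The Mean-Value Theorem for functions from $\mathbf{R}^1$ to $\mathbf{R}^1$ states that

$$f(y) - f(x) = f'(z)(y - x), \tag{22}$$

where $z$ lies between $x$ and $y$. This equation is false, in general, for vector-valued
functions from $\mathbf{R}^n$ to $\mathbf{R}^m$, when $m > 1$. (See Exercise 12.19.) However, we will
show that a correct equation is obtained by taking the dot product of each member
of (22) with any vector in $\mathbf{R}^m$, provided $z$ is suitably chosen. This gives a useful
generalization of the Mean-Value Theorem for vector-valued functions.

In the statement of the theorem we use the notation $L(\mathbf{x}, \mathbf{y})$ to denote the line
segment joining two points $\mathbf{x}$ and $\mathbf{y}$ in $\mathbf{R}^n$. That is,

$$L(\mathbf{x}, \mathbf{y}) = \{t\mathbf{x} + (1 - t)\mathbf{y} : 0 \le t \le 1\}.$$

Theorem 12.9 (Mean-Value Theorem). Let $S$ be an open subset of $\mathbf{R}^n$ and assume
that $\mathbf{f} : S \to \mathbf{R}^m$ is differentiable at each point of $S$. Let $\mathbf{x}$ and $\mathbf{y}$ be two points in $S$
such that $L(\mathbf{x}, \mathbf{y}) \subseteq S$. Then for every vector $\mathbf{a}$ in $\mathbf{R}^m$ there is a point $\mathbf{z}$ in $L(\mathbf{x}, \mathbf{y})$
such that

$$\mathbf{a} \cdot \{\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\} = \mathbf{a} \cdot \{\mathbf{f}'(\mathbf{z})(\mathbf{y} - \mathbf{x})\}. \tag{23}$$

Proof. Let $\mathbf{u} = \mathbf{y} - \mathbf{x}$. Since $S$ is open and $L(\mathbf{x}, \mathbf{y}) \subseteq S$, there is a $\delta > 0$ such
that $\mathbf{x} + t\mathbf{u} \in S$ for all real $t$ in the interval $(-\delta, 1 + \delta)$. Let $\mathbf{a}$ be a fixed vector in
$\mathbf{R}^m$ and let $F$ be the real-valued function defined on $(-\delta, 1 + \delta)$ by the equation

$$F(t) = \mathbf{a} \cdot \mathbf{f}(\mathbf{x} + t\mathbf{u}).$$

Then $F$ is differentiable on $(-\delta, 1 + \delta)$ and its derivative is given by

$$F'(t) = \mathbf{a} \cdot \mathbf{f}'(\mathbf{x} + t\mathbf{u}; \mathbf{u}) = \mathbf{a} \cdot \{\mathbf{f}'(\mathbf{x} + t\mathbf{u})(\mathbf{u})\}.$$

By the usual Mean-Value Theorem we have

$$F(1) - F(0) = F'(\theta), \quad \text{where } 0 < \theta < 1.$$

Now

$$F'(\theta) = \mathbf{a} \cdot \{\mathbf{f}'(\mathbf{x} + \theta\mathbf{u})(\mathbf{u})\} = \mathbf{a} \cdot \{\mathbf{f}'(\mathbf{z})(\mathbf{y} - \mathbf{x})\},$$

where $\mathbf{z} = \mathbf{x} + \theta\mathbf{u} \in L(\mathbf{x}, \mathbf{y})$. But $F(1) - F(0) = \mathbf{a} \cdot \{\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\}$, so we obtain
(23). Of course, the point $\mathbf{z}$ depends on $F$, and hence on $\mathbf{a}$.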

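NOTE. If $S$ is convex, then $L(\mathbf{x}, \mathbf{y}) \subseteq S$ for all $\mathbf{x}$, $\mathbf{y}$ in $S$ so (23) holds for all $\mathbf{x}$ and
$\mathbf{y}$ in $S$.

Examples

1. If $f$ is real-valued ($m = 1$) we can take $a = 1$ in (23) to obtain

$$f(\mathbf{y}) - f(\mathbf{x}) = f'(\mathbf{z})(\mathbf{y} - \mathbf{x}) = \nabla f(\mathbf{z}) \cdot (\mathbf{y} - \mathbf{x}). \tag{24}$$

2. If $\mathbf{f}$ is vector-valued and if $\mathbf{a}$ is a unit vector in $\mathbf{R}^m$, $\|\mathbf{a}\| = 1$, Eq. (23) and the Cauchy-Schwarz
inequality give us

$$\|\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\| \le \|\mathbf{f}'(\mathbf{z})(\mathbf{y} - \mathbf{x})\|.$$

Using (12) we obtain the inequality

$$\|\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\| \le M\|\mathbf{y} - \mathbf{x}\|,$$

where $M = \sum_{k=1}^{m} \|\nabla f_k(\mathbf{z})\|$. Note that $M$ depends on $\mathbf{z}$ and hence on $\mathbf{x}$ and $\mathbf{y}$.

3. If $S$ is convex and if all the partial derivatives $D_j f_k$ are bounded on $S$, then there is a
constant $A > 0$ such that

$$\|\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\| \le A\|\mathbf{y} - \mathbf{x}\|.$$

In other words, $\mathbf{f}$ satisfies a Lipschitz condition on $S$.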
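
To see why the dot product with $\mathbf{a}$ is needed in (23), consider the sample curve $\mathbf{f}(t) = (\cos t, \sin t)$ on $[0, 2\pi]$ (a standard illustration, chosen here as an assumption; it is not claimed to be the content of Exercise 12.19). The increment $\mathbf{f}(2\pi) - \mathbf{f}(0)$ vanishes while $\mathbf{f}'(z)(y - x)$ always has norm $2\pi$, so (22) fails; yet for a fixed $\mathbf{a}$ a suitable $z$ exists, and the sketch finds one by bisection.

```python
import math

def f(t):
    return (math.cos(t), math.sin(t))

def f_prime(t):
    return (-math.sin(t), math.cos(t))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y = 0.0, 2 * math.pi
increment = tuple(p - q for p, q in zip(f(y), f(x)))   # essentially (0, 0)

# Equation (22) would need f'(z)(y - x) = f(y) - f(x), but the left side
# has norm 2*pi at every sampled z.
smallest_norm = min(math.hypot(*f_prime(x + (y - x) * i / 1000)) * (y - x)
                    for i in range(1001))

a = (1.0, 2.0)   # an arbitrary fixed vector for the dot product in (23)

def gap(z):
    """Right side of (23) minus left side, as a function of z."""
    return dot(a, f_prime(z)) * (y - x) - dot(a, increment)

lo, hi = 0.0, math.pi / 2      # gap(lo) > 0 > gap(hi), so a root lies between
for _ in range(60):
    mid = (lo + hi) / 2
    if gap(mid) > 0:
        lo = mid
    else:
        hi = mid
z = (lo + hi) / 2

print(smallest_norm, z, gap(z))
```

A different vector $\mathbf{a}$ generally yields a different $z$, which is the dependence noted at the end of the proof of Theorem 12.9.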

The Mean-Value Theorem gives a simple proof of the following result concerning
functions with zero total derivative.

Theorem 12.10. Let $S$ be an open connected subset of $\mathbf{R}^n$, and let $\mathbf{f} : S \to \mathbf{R}^m$ be
differentiable at each point of $S$. If $\mathbf{f}'(\mathbf{c}) = \mathbf{0}$ for each $\mathbf{c}$ in $S$, then $\mathbf{f}$ is constant on $S$.

Proof. Since $S$ is open and connected, it is polygonally connected. (See Section
4.18.) Therefore, every pair of points $\mathbf{x}$ and $\mathbf{y}$ in $S$ can be joined by a polygonal
arc lying in $S$. Denote the vertices of this arc by $\mathbf{p}_1, \ldots, \mathbf{p}_r$, where $\mathbf{p}_1 = \mathbf{x}$ and
$\mathbf{p}_r = \mathbf{y}$. Since each segment $L(\mathbf{p}_{i+1}, \mathbf{p}_i) \subseteq S$, the Mean-Value Theorem shows that

$$\mathbf{a} \cdot \{\mathbf{f}(\mathbf{p}_{i+1}) - \mathbf{f}(\mathbf{p}_i)\} = 0,$$

for every vector $\mathbf{a}$. Adding these equations for $i = 1, 2, \ldots, r - 1$, we find

$$\mathbf{a} \cdot \{\mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})\} = 0,$$

for every $\mathbf{a}$. Taking $\mathbf{a} = \mathbf{f}(\mathbf{y}) - \mathbf{f}(\mathbf{x})$ we find $\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{y})$, so $\mathbf{f}$ is constant on $S$.

12.12 A SUFFICIENT CONDITION FOR DIFFERENTIABILITY

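Up to now we have been deriving consequences of the hypothesis that a function
is differentiable. We have also seen that neither the existence of all partial derivatives
nor the existence of all directional derivatives suffices to establish differentiability
(since neither implies continuity). The next theorem shows that
continuity of all but one of the partials does imply differentiability.

Theorem 12.11. Assume that one of the partial derivatives $D_1 \mathbf{f}, \ldots, D_n \mathbf{f}$ exists at $\mathbf{c}$
and that the remaining $n - 1$ partial derivatives exist in some $n$-ball $B(\mathbf{c})$ and are
continuous at $\mathbf{c}$. Then $\mathbf{f}$ is differentiable at $\mathbf{c}$.

Proof. First we note that a vector-valued function $\mathbf{f} = (f_1, \ldots, f_m)$ is differentiable
at $\mathbf{c}$ if, and only if, each component $f_k$ is differentiable at $\mathbf{c}$. (The proof of this is an
easy exercise.) Therefore, it suffices to prove the theorem when $f$ is real-valued.

For the proof we suppose that $D_1 f(\mathbf{c})$ exists and that the continuous partials
are $D_2 f, \ldots, D_n f$.

The only candidate for $f'(\mathbf{c})$ is the gradient vector $\nabla f(\mathbf{c})$. We will prove that

$$f(\mathbf{c} + \mathbf{v}) - f(\mathbf{c}) = \nabla f(\mathbf{c}) \cdot \mathbf{v} + o(\|\mathbf{v}\|) \quad \text{as } \mathbf{v} \to \mathbf{0},$$

and this will prove the theorem. The idea is to express the difference $f(\mathbf{c} + \mathbf{v}) - f(\mathbf{c})$
as a sum of $n$ terms, where the $k$th term is an approximation to $D_k f(\mathbf{c})v_k$.

For this purpose we write $\mathbf{v} = \lambda\mathbf{y}$, where $\|\mathbf{y}\| = 1$ and $\lambda = \|\mathbf{v}\|$. We keep $\lambda$
small enough so that $\mathbf{c} + \mathbf{v}$ lies in the ball $B(\mathbf{c})$ in which the partial derivatives
$D_2 f, \ldots, D_n f$ exist. Expressing $\mathbf{y}$ in terms of its components we have

$$\mathbf{y} = y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n,$$

where $\mathbf{u}_k$ is the $k$th unit coordinate vector. Now we write the difference $f(\mathbf{c} + \mathbf{v}) -
f(\mathbf{c})$ as a telescoping sum,

$$f(\mathbf{c} + \mathbf{v}) - f(\mathbf{c}) = f(\mathbf{c} + \lambda\mathbf{y}) - f(\mathbf{c}) = \sum_{k=1}^{n} \{f(\mathbf{c} + \lambda\mathbf{v}_k) - f(\mathbf{c} + \lambda\mathbf{v}_{k-1})\}, \tag{25}$$

where

$$\mathbf{v}_0 = \mathbf{0}, \quad \mathbf{v}_1 = y_1\mathbf{u}_1, \quad \mathbf{v}_2 = y_1\mathbf{u}_1 + y_2\mathbf{u}_2, \quad \ldots, \quad \mathbf{v}_n = y_1\mathbf{u}_1 + \cdots + y_n\mathbf{u}_n.$$

The first term in the sum is $f(\mathbf{c} + \lambda y_1\mathbf{u}_1) - f(\mathbf{c})$. Since the two points $\mathbf{c}$ and
$\mathbf{c} + \lambda y_1\mathbf{u}_1$ differ only in their first component, and since $D_1 f(\mathbf{c})$ exists, we can
write

$$f(\mathbf{c} + \lambda y_1\mathbf{u}_1) - f(\mathbf{c}) = \lambda y_1 D_1 f(\mathbf{c}) + \lambda y_1 E_1(\lambda),$$

where $E_1(\lambda) \to 0$ as $\lambda \to 0$.

For $k \ge 2$, the $k$th term in the sum is

$$f(\mathbf{c} + \lambda\mathbf{v}_{k-1} + \lambda y_k\mathbf{u}_k) - f(\mathbf{c} + \lambda\mathbf{v}_{k-1}) = f(\mathbf{b}_k + \lambda y_k\mathbf{u}_k) - f(\mathbf{b}_k),$$

where $\mathbf{b}_k = \mathbf{c} + \lambda\mathbf{v}_{k-1}$. The two points $\mathbf{b}_k$ and $\mathbf{b}_k + \lambda y_k\mathbf{u}_k$ differ only in their $k$th
component, and we can apply the one-dimensional Mean-Value Theorem for
derivatives to write

$$f(\mathbf{b}_k + \lambda y_k\mathbf{u}_k) - f(\mathbf{b}_k) = \lambda y_k D_k f(\mathbf{a}_k), \tag{26}$$

where $\mathbf{a}_k$ lies on the line segment joining $\mathbf{b}_k$ to $\mathbf{b}_k + \lambda y_k\mathbf{u}_k$. Note that $\mathbf{b}_k \to \mathbf{c}$ and
hence $\mathbf{a}_k \to \mathbf{c}$ as $\lambda \to 0$. Since each $D_k f$ is continuous at $\mathbf{c}$ for $k \ge 2$ we can write

$$D_k f(\mathbf{a}_k) = D_k f(\mathbf{c}) + E_k(\lambda), \quad \text{where } E_k(\lambda) \to 0 \text{ as } \lambda \to 0.$$

Using this in (26) we find that (25) becomes

$$f(\mathbf{c} + \mathbf{v}) - f(\mathbf{c}) = \lambda \sum_{k=1}^{n} D_k f(\mathbf{c})\,y_k + \lambda \sum_{k=1}^{n} y_k E_k(\lambda) = \nabla f(\mathbf{c}) \cdot \mathbf{v} + \|\mathbf{v}\|\,E(\lambda),$$

where

$$E(\lambda) = \sum_{k=1}^{n} y_k E_k(\lambda) \to 0 \quad \text{as } \|\mathbf{v}\| \to 0.$$

This completes the proof.

NOTE. Continuity of at least $n - 1$ of the partials $D_1 \mathbf{f}, \ldots, D_n \mathbf{f}$ at $\mathbf{c}$, although
sufficient, is by no means necessary for differentiability of $\mathbf{f}$ at $\mathbf{c}$. (See Exercises
12.5 and 12.6.)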

12.13 A SUFFICIENT CONDITION FOR EQUALITY OF MIXED PARTIAL 
DERIVATIVES 

The partial derivatives $D_1 \mathbf{f}, \ldots, D_n \mathbf{f}$ of a function from $\mathbf{R}^n$ to $\mathbf{R}^m$ are themselves
functions from $\mathbf{R}^n$ to $\mathbf{R}^m$ and they, in turn, can have partial derivatives. These are
called second-order partial derivatives. We use the notation introduced in Chapter
5 for real-valued functions:

$$D_{r,k}\mathbf{f} = D_r(D_k \mathbf{f}) = \frac{\partial^2 \mathbf{f}}{\partial x_r\,\partial x_k}.$$

Higher-order partial derivatives are similarly defined.

The example

$$f(x, y) = \begin{cases} xy(x^2 - y^2)/(x^2 + y^2) & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0), \end{cases}$$

shows that $D_{1,2}f(x, y)$ is not necessarily the same as $D_{2,1}f(x, y)$. In fact, in this
example we have

$$D_1 f(x, y) = \frac{y(x^4 + 4x^2y^2 - y^4)}{(x^2 + y^2)^2} \quad \text{if } (x, y) \ne (0, 0),$$

and $D_1 f(0, 0) = 0$. Hence, $D_1 f(0, y) = -y$ for all $y$ and therefore

$$D_{2,1}f(0, y) = -1, \qquad D_{2,1}f(0, 0) = -1.$$

On the other hand, we have

$$D_2 f(x, y) = \frac{x(x^4 - 4x^2y^2 - y^4)}{(x^2 + y^2)^2} \quad \text{if } (x, y) \ne (0, 0),$$

and $D_2 f(0, 0) = 0$, so that $D_2 f(x, 0) = x$ for all $x$. Therefore, $D_{1,2}f(x, 0) = 1$,
$D_{1,2}f(0, 0) = 1$, and we see that $D_{2,1}f(0, 0) \ne D_{1,2}f(0, 0)$.
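
The two mixed partials of this example can be estimated with nested central differences, using a much smaller inner step than outer step. The helper names and step sizes below are assumptions chosen for illustration, and the results are approximations rather than exact values.

```python
def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def d1(g, x, y, h=1e-6):
    """Central-difference estimate of D_1 g at (x, y)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d2(g, x, y, h=1e-6):
    """Central-difference estimate of D_2 g at (x, y)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

k = 1e-3   # outer step, large compared with the inner step 1e-6

# D_{2,1} f(0, 0) = D_2 (D_1 f)(0, 0): vary y in the function y -> D_1 f(0, y).
d21 = (d1(f, 0.0, k) - d1(f, 0.0, -k)) / (2 * k)

# D_{1,2} f(0, 0) = D_1 (D_2 f)(0, 0): vary x in the function x -> D_2 f(x, 0).
d12 = (d2(f, k, 0.0) - d2(f, -k, 0.0)) / (2 * k)

print(d21, d12)   # close to -1 and +1 respectively
```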

The next theorem gives us a criterion for determining when the two mixed
partials $D_{1,2}\mathbf{f}$ and $D_{2,1}\mathbf{f}$ will be equal.

Theorem 12.12. If both partial derivatives $D_r \mathbf{f}$ and $D_k \mathbf{f}$ exist in an $n$-ball $B(\mathbf{c}; \delta)$ and
if both are differentiable at $\mathbf{c}$, then

$$D_{r,k}\mathbf{f}(\mathbf{c}) = D_{k,r}\mathbf{f}(\mathbf{c}). \tag{27}$$

Proof. If $\mathbf{f} = (f_1, \ldots, f_m)$, then $D_k \mathbf{f} = (D_k f_1, \ldots, D_k f_m)$. Therefore it suffices
to prove the theorem for real-valued $f$. Also, since only two components are
involved in (27), it suffices to consider the case $n = 2$. For simplicity, we assume
that $\mathbf{c} = (0, 0)$. We shall prove that

$$D_{1,2}f(0, 0) = D_{2,1}f(0, 0).$$

Choose $h \ne 0$ so that the square with vertices $(0, 0)$, $(h, 0)$, $(h, h)$, and $(0, h)$
lies in the 2-ball $B(\mathbf{0}; \delta)$. Consider the quantity

$$\Delta(h) = f(h, h) - f(h, 0) - f(0, h) + f(0, 0).$$

We will show that $\Delta(h)/h^2$ tends to both $D_{2,1}f(0, 0)$ and $D_{1,2}f(0, 0)$ as $h \to 0$.

Let $G(x) = f(x, h) - f(x, 0)$ and note that

$$\Delta(h) = G(h) - G(0). \tag{28}$$

By the one-dimensional Mean-Value Theorem we have

$$G(h) - G(0) = hG'(x_1) = h\{D_1 f(x_1, h) - D_1 f(x_1, 0)\}, \tag{29}$$

where $x_1$ lies between $0$ and $h$. Since $D_1 f$ is differentiable at $(0, 0)$, we have the
first-order Taylor formulas

$$D_1 f(x_1, h) = D_1 f(0, 0) + D_{1,1}f(0, 0)\,x_1 + D_{2,1}f(0, 0)\,h + (x_1^2 + h^2)^{1/2} E_1(h),$$

and

$$D_1 f(x_1, 0) = D_1 f(0, 0) + D_{1,1}f(0, 0)\,x_1 + |x_1|\,E_2(h),$$

where $E_1(h)$ and $E_2(h) \to 0$ as $h \to 0$. Using these in (29) and (28) we find

$$\Delta(h) = D_{2,1}f(0, 0)\,h^2 + E(h),$$

where $E(h) = h(x_1^2 + h^2)^{1/2} E_1(h) - h|x_1|\,E_2(h)$. Since $|x_1| \le |h|$, we have

$$0 \le |E(h)| \le \sqrt{2}\,h^2 |E_1(h)| + h^2 |E_2(h)|,$$

so

$$\lim_{h \to 0} \frac{\Delta(h)}{h^2} = D_{2,1}f(0, 0).$$

Applying the same procedure to the function $H(y) = f(h, y) - f(0, y)$ in
place of $G(x)$, we find that

$$\lim_{h \to 0} \frac{\Delta(h)}{h^2} = D_{1,2}f(0, 0),$$

which completes the proof.

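As a consequence of Theorems 12.11 and 12.12 we have:


Theorem 12.13. If both partial derivatives $D_r \mathbf{f}$ and $D_k \mathbf{f}$ exist in an n-ball $B(\mathbf{c})$ and
if both $D_{r,k} \mathbf{f}$ and $D_{k,r} \mathbf{f}$ are continuous at $\mathbf{c}$, then

$$D_{r,k} \mathbf{f}(\mathbf{c}) = D_{k,r} \mathbf{f}(\mathbf{c}).$$

Note. We mention (without proof) another result which states that if $D_r \mathbf{f}$, $D_k \mathbf{f}$ and
$D_{k,r} \mathbf{f}$ are continuous in an n-ball $B(\mathbf{c})$, then $D_{r,k} \mathbf{f}(\mathbf{c})$ exists and equals $D_{k,r} \mathbf{f}(\mathbf{c})$.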

If $f$ is a real-valued function of two variables, there are four second-order
partial derivatives to consider; namely, $D_{1,1} f$, $D_{1,2} f$, $D_{2,1} f$, and $D_{2,2} f$. We have
just shown that only three of these are distinct if $f$ is suitably restricted.

The number of partial derivatives of order k which can be formed is $2^k$. If all
these derivatives are continuous in a neighborhood of the point (x, y), then
certain of the mixed partials will be equal. Each mixed partial is of the form
$D_{r_1, \dots, r_k} f$, where each $r_i$ is either 1 or 2. If we have two such mixed partials,
$D_{r_1, \dots, r_k} f$ and $D_{p_1, \dots, p_k} f$, where the k-tuple $(r_1, \dots, r_k)$ is a permutation of
the k-tuple $(p_1, \dots, p_k)$, then the two partials will be equal at (x, y) if all $2^k$ partials
are continuous in a neighborhood of (x, y). This statement can be easily proved
by mathematical induction, using Theorem 12.13 (which is the case k = 2). We
omit the proof for general k. From this it follows that among the $2^k$ partial
derivatives of order k, there are only k + 1 distinct partials in general, namely,
those of the form $D_{r_1, \dots, r_k} f$, where the k-tuple $(r_1, \dots, r_k)$ assumes the following
k + 1 forms:

$$(2, 2, \dots, 2), \quad (1, 2, 2, \dots, 2), \quad (1, 1, 2, \dots, 2), \quad \dots, \quad (1, 1, \dots, 1, 2), \quad (1, 1, \dots, 1).$$
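The count of k + 1 distinct partials can be checked by brute force. The sketch below is illustrative only: the helper name `count_mixed_partials` is hypothetical, and it simply enumerates all index k-tuples and groups those that are permutations of one another.

```python
from itertools import product

def count_mixed_partials(n, k):
    """Return (number of index k-tuples, number of classes up to permutation).

    Each index tuple (r_1, ..., r_k) with entries in 1..n labels one partial
    derivative of order k. Two tuples fall in the same class when one is a
    permutation of the other, which we detect by sorting.
    """
    tuples = list(product(range(1, n + 1), repeat=k))
    classes = {tuple(sorted(r)) for r in tuples}
    return len(tuples), len(classes)

# Two variables: 2^k tuples, but only k + 1 classes.
for k in range(1, 6):
    total, distinct = count_mixed_partials(2, k)
    print(f"k = {k}: {total} partials, {distinct} distinct when all are continuous")
```

For two variables the classes are determined by how many indices equal 1, which is why exactly k + 1 of them appear.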




Similar statements hold, of course, for functions of n variables. In this case,
there are $n^k$ partial derivatives of order k that can be formed. Continuity of all
these partials at a point $\mathbf{x}$ implies that $D_{r_1, \dots, r_k} f(\mathbf{x})$ is unchanged when the
indices $r_1, \dots, r_k$ are permuted. Each $r_i$ is now a positive integer $\le n$.

12.14 TAYLOR'S FORMULA FOR FUNCTIONS FROM $\mathbf{R}^n$ TO $\mathbf{R}^1$

Taylor's formula (Theorem 5.19) can be extended to real-valued functions $f$ defined
on subsets of $\mathbf{R}^n$. In order to state the general theorem in a form which resembles
the one-dimensional case, we introduce special symbols

$$f''(\mathbf{x}; \mathbf{t}), \quad f'''(\mathbf{x}; \mathbf{t}), \quad \dots, \quad f^{(m)}(\mathbf{x}; \mathbf{t}),$$

for certain sums that arise in Taylor's formula. These play the role of higher-order
directional derivatives, and they are defined as follows:

If $\mathbf{x}$ is a point in $\mathbf{R}^n$ where all second-order partial derivatives of $f$ exist, and if
$\mathbf{t} = (t_1, \dots, t_n)$ is an arbitrary point in $\mathbf{R}^n$, we write

$$f''(\mathbf{x}; \mathbf{t}) = \sum_{i=1}^{n} \sum_{j=1}^{n} D_{i,j} f(\mathbf{x})\, t_j t_i.$$

We also define

$$f'''(\mathbf{x}; \mathbf{t}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} D_{i,j,k} f(\mathbf{x})\, t_k t_j t_i$$

if all third-order partial derivatives exist at $\mathbf{x}$. The symbol $f^{(m)}(\mathbf{x}; \mathbf{t})$ is similarly
defined if all mth-order partials exist.

These sums are analogous to the formula

$$f'(\mathbf{x}; \mathbf{t}) = \sum_{i=1}^{n} D_i f(\mathbf{x})\, t_i$$

for the directional derivative of a function which is differentiable at $\mathbf{x}$.
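To see why these sums behave like second-order directional derivatives, one can compare the double sum with a numerical second derivative of s -> f(x + s t) at s = 0. This is a minimal sketch under stated assumptions: the sample function f(x, y) = x^3 y + y^2 and the helper names are hypothetical and serve only as an illustration.

```python
def f(x, y):
    # Sample polynomial (an assumption for illustration).
    return x**3 * y + y**2

def hessian(x, y):
    # Exact second-order partials of the sample f:
    # D_{1,1} f = 6xy, D_{1,2} f = D_{2,1} f = 3x^2, D_{2,2} f = 2.
    return [[6 * x * y, 3 * x**2],
            [3 * x**2, 2.0]]

def second_directional(x, t):
    # The double sum: sum over i, j of D_{i,j} f(x) * t_j * t_i.
    h = hessian(*x)
    n = len(t)
    return sum(h[i][j] * t[j] * t[i] for i in range(n) for j in range(n))

def second_derivative_along_line(x, t, s=1e-4):
    # Central second difference of g(s) = f(x + s t) at s = 0.
    def g(step):
        return f(x[0] + step * t[0], x[1] + step * t[1])
    return (g(s) - 2 * g(0.0) + g(-s)) / s**2

point = (1.0, 2.0)
direction = (0.5, -1.0)
print(second_directional(point, direction))          # value of the double sum
print(second_derivative_along_line(point, direction))  # numerical g''(0)
```

For this sample the double sum is 12(0.25) + 2(3)(0.5)(-1) + 2(1) = 2, and the numerical second derivative should agree to several decimal places.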

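Theorem 12.14 (Taylor's formula). Assume that $f$ and all its partial derivatives of
order $< m$ are differentiable at each point of an open set S in $\mathbf{R}^n$. If $\mathbf{a}$ and $\mathbf{b}$ are two
points of S such that $L(\mathbf{a}, \mathbf{b}) \subseteq S$, then there is a point $\mathbf{z}$ on the line segment $L(\mathbf{a}, \mathbf{b})$
such that

$$f(\mathbf{b}) - f(\mathbf{a}) = \sum_{k=1}^{m-1} \frac{1}{k!} f^{(k)}(\mathbf{a}; \mathbf{b} - \mathbf{a}) + \frac{1}{m!} f^{(m)}(\mathbf{z}; \mathbf{b} - \mathbf{a}).$$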

Proof. Since S is open, there is a $\delta > 0$ such that $\mathbf{a} + t(\mathbf{b} - \mathbf{a}) \in S$ for all real
t in the interval $-\delta < t < 1 + \delta$. Define $g$ on $(-\delta, 1 + \delta)$ by the equation

$$g(t) = f[\mathbf{a} + t(\mathbf{b} - \mathbf{a})].$$

Then $f(\mathbf{b}) - f(\mathbf{a}) = g(1) - g(0)$. We will prove the theorem by applying the
one-dimensional Taylor formula to $g$, writing

$$g(1) - g(0) = \sum_{k=1}^{m-1} \frac{1}{k!} g^{(k)}(0) + \frac{1}{m!} g^{(m)}(\theta), \quad \text{where } 0 < \theta < 1. \tag{30}$$

Now $g$ is a composite function given by $g(t) = f[\mathbf{p}(t)]$, where $\mathbf{p}(t) = \mathbf{a} + t(\mathbf{b} - \mathbf{a})$.
The kth component of $\mathbf{p}$ has derivative $p_k'(t) = b_k - a_k$. Applying the chain rule,
we see that $g'(t)$ exists in the interval $(-\delta, 1 + \delta)$ and is given by the formula

$$g'(t) = \sum_{j=1}^{n} D_j f[\mathbf{p}(t)](b_j - a_j) = f'(\mathbf{p}(t); \mathbf{b} - \mathbf{a}).$$

Again applying the chain rule, we obtain

$$g''(t) = \sum_{i=1}^{n} \sum_{j=1}^{n} D_{i,j} f[\mathbf{p}(t)](b_j - a_j)(b_i - a_i) = f''(\mathbf{p}(t); \mathbf{b} - \mathbf{a}).$$

Similarly, we find that $g^{(m)}(t) = f^{(m)}(\mathbf{p}(t); \mathbf{b} - \mathbf{a})$. When these are used in (30)
we obtain the theorem, since the point $\mathbf{z} = \mathbf{a} + \theta(\mathbf{b} - \mathbf{a}) \in L(\mathbf{a}, \mathbf{b})$.
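Taylor's formula with m = 2 can be checked directly on a quadratic polynomial, because then the second-order partials are constant and the choice of the intermediate point z does not matter. The following is a minimal sketch; the quadratic f(x, y) = 3x^2 - 2xy + y^2 + x and the points a, b are assumptions chosen only for illustration.

```python
def f(p):
    x, y = p
    return 3 * x**2 - 2 * x * y + y**2 + x

def gradient(p):
    # (D_1 f, D_2 f) for the sample quadratic.
    x, y = p
    return (6 * x - 2 * y + 1, -2 * x + 2 * y)

# Second-order partials D_{i,j} f; constant because f is quadratic.
HESSIAN = ((6.0, -2.0), (-2.0, 2.0))

def first_order_term(a, t):
    # f'(a; t) = sum of D_i f(a) * t_i.
    g = gradient(a)
    return sum(g[i] * t[i] for i in range(len(t)))

def second_order_term(t):
    # f''(z; t) = sum of D_{i,j} f(z) * t_j * t_i, the same for every z here.
    n = len(t)
    return sum(HESSIAN[i][j] * t[j] * t[i] for i in range(n) for j in range(n))

a = (1.0, -1.0)
b = (2.5, 0.5)
t = tuple(bi - ai for ai, bi in zip(a, b))  # t = b - a

lhs = f(b) - f(a)
rhs = first_order_term(a, t) + 0.5 * second_order_term(t)
print(lhs, rhs)
```

Both sides evaluate to 12 for these points: the first-order term contributes 7.5 and the second-order term contributes 9/2.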


EXERCISES 


Differentiable functions 

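12.1 Let S be an open subset of $\mathbf{R}^n$, and let $f : S \to \mathbf{R}$ be a real-valued function with
finite partial derivatives $D_1 f, \dots, D_n f$ on S. If $f$ has a local maximum or a local minimum
at a point $\mathbf{c}$ in S, prove that $D_k f(\mathbf{c}) = 0$ for each k.

12.2 Calculate all first-order partial derivatives and the directional derivative $f'(\mathbf{x}; \mathbf{u})$
for each of the real-valued functions defined on $\mathbf{R}^n$ as follows:

a) $f(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$, where $\mathbf{a}$ is a fixed vector in $\mathbf{R}^n$.

b) $f(\mathbf{x}) = \|\mathbf{x}\|^4$.

c) $f(\mathbf{x}) = \mathbf{x} \cdot \mathbf{L}(\mathbf{x})$, where $\mathbf{L} : \mathbf{R}^n \to \mathbf{R}^n$ is a linear function.

d) $f(\mathbf{x}) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$, where $a_{ij} = a_{ji}$.

12.3 Let $\mathbf{f}$ and $\mathbf{g}$ be functions with values in $\mathbf{R}^m$ such that the directional derivatives
$\mathbf{f}'(\mathbf{c}; \mathbf{u})$ and $\mathbf{g}'(\mathbf{c}; \mathbf{u})$ exist. Prove that the sum $\mathbf{f} + \mathbf{g}$ and dot product $\mathbf{f} \cdot \mathbf{g}$ have directional
derivatives given by

$$(\mathbf{f} + \mathbf{g})'(\mathbf{c}; \mathbf{u}) = \mathbf{f}'(\mathbf{c}; \mathbf{u}) + \mathbf{g}'(\mathbf{c}; \mathbf{u})$$

and

$$(\mathbf{f} \cdot \mathbf{g})'(\mathbf{c}; \mathbf{u}) = \mathbf{f}(\mathbf{c}) \cdot \mathbf{g}'(\mathbf{c}; \mathbf{u}) + \mathbf{g}(\mathbf{c}) \cdot \mathbf{f}'(\mathbf{c}; \mathbf{u}).$$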


12.4 If $S \subseteq \mathbf{R}^n$, let $\mathbf{f} : S \to \mathbf{R}^m$ be a function with values in $\mathbf{R}^m$, and write $\mathbf{f} = (f_1, \dots, f_m)$.
Prove that $\mathbf{f}$ is differentiable at an interior point $\mathbf{c}$ of S if, and only if, each $f_i$ is differentiable
at $\mathbf{c}$.

12.5 Given n real-valued functions $f_1, \dots, f_n$, each differentiable on an open interval
(a, b) in $\mathbf{R}$. For each $\mathbf{x} = (x_1, \dots, x_n)$ in the n-dimensional open interval

$$S = \{(x_1, \dots, x_n) : a < x_k < b, \ k = 1, 2, \dots, n\},$$

define $f(\mathbf{x}) = f_1(x_1) + \cdots + f_n(x_n)$. Prove that $f$ is differentiable at each point of S and
that

$$f'(\mathbf{x})(\mathbf{u}) = \sum_{i=1}^{n} f_i'(x_i) u_i, \quad \text{where } \mathbf{u} = (u_1, \dots, u_n).$$

12.6 Given n real-valued functions $f_1, \dots, f_n$ defined on an open set S in $\mathbf{R}^n$. For each $\mathbf{x}$
in S, define $f(\mathbf{x}) = f_1(\mathbf{x}) + \cdots + f_n(\mathbf{x})$. Assume that for each k = 1, 2, ..., n, the
following limit exists:

$$\lim_{\substack{\mathbf{y} \to \mathbf{x} \\ y_k \ne x_k}} \frac{f_k(\mathbf{y}) - f_k(\mathbf{x})}{y_k - x_k}.$$

Call this limit $a_k(\mathbf{x})$. Prove that $f$ is differentiable at $\mathbf{x}$ and that

$$f'(\mathbf{x})(\mathbf{u}) = \sum_{k=1}^{n} a_k(\mathbf{x}) u_k \quad \text{if } \mathbf{u} = (u_1, \dots, u_n).$$

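12.7 Let $\mathbf{f}$ and $\mathbf{g}$ be functions from $\mathbf{R}^n$ to $\mathbf{R}^m$. Assume that $\mathbf{f}$ is differentiable at $\mathbf{c}$, that
$\mathbf{f}(\mathbf{c}) = \mathbf{0}$, and that $\mathbf{g}$ is continuous at $\mathbf{c}$. Let $h(\mathbf{x}) = \mathbf{g}(\mathbf{x}) \cdot \mathbf{f}(\mathbf{x})$. Prove that $h$ is differentiable
at $\mathbf{c}$ and that

$$h'(\mathbf{c})(\mathbf{u}) = \mathbf{g}(\mathbf{c}) \cdot \{\mathbf{f}'(\mathbf{c})(\mathbf{u})\} \quad \text{if } \mathbf{u} \in \mathbf{R}^n.$$

12.8 Let $\mathbf{f} : \mathbf{R}^2 \to \mathbf{R}^3$ be defined by the equation

$$\mathbf{f}(x, y) = (\sin x \cos y, \ \sin x \sin y, \ \cos x \cos y).$$

Determine the Jacobian matrix $D\mathbf{f}(x, y)$.

12.9 Prove that there is no real-valued function $f$ such that $f'(\mathbf{c}; \mathbf{u}) > 0$ for a fixed point
$\mathbf{c}$ in $\mathbf{R}^n$ and every nonzero vector $\mathbf{u}$ in $\mathbf{R}^n$. Give an example such that $f'(\mathbf{c}; \mathbf{u}) > 0$ for a
fixed direction $\mathbf{u}$ and every $\mathbf{c}$ in $\mathbf{R}^n$.

12.10 Let $f = u + iv$ be a complex-valued function such that the derivative $f'(c)$ exists
for some complex c. Write $z = c + re^{i\alpha}$ (where $\alpha$ is real and fixed) and let $r \to 0$ in the
difference quotient $[f(z) - f(c)]/(z - c)$ to obtain

$$f'(c) = e^{-i\alpha}[u'(c; \mathbf{a}) + iv'(c; \mathbf{a})],$$

where $\mathbf{a} = (\cos \alpha, \sin \alpha)$, and $u'(c; \mathbf{a})$ and $v'(c; \mathbf{a})$ are directional derivatives. Let
$\mathbf{b} = (\cos \beta, \sin \beta)$, where $\beta = \alpha + \tfrac{1}{2}\pi$, and show by a similar argument that

$$f'(c) = e^{-i\alpha}[v'(c; \mathbf{b}) - iu'(c; \mathbf{b})].$$

Deduce that $u'(c; \mathbf{a}) = v'(c; \mathbf{b})$ and $v'(c; \mathbf{a}) = -u'(c; \mathbf{b})$. The Cauchy-Riemann equations
(Theorem 5.22) are a special case.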


Gradients and the chain rule 

12.11 Let $f$ be real-valued and differentiable at a point $\mathbf{c}$ in $\mathbf{R}^n$, and assume that
$\|\nabla f(\mathbf{c})\| \ne 0$. Prove that there is one and only one unit vector $\mathbf{u}$ in $\mathbf{R}^n$ such that
$|f'(\mathbf{c}; \mathbf{u})| = \|\nabla f(\mathbf{c})\|$, and that this is the unit vector for which $|f'(\mathbf{c}; \mathbf{u})|$ has its maximum
value.

12.12 Compute the gradient vector $\nabla f(x, y)$ at those points (x, y) in $\mathbf{R}^2$ where it exists:

a) $f(x, y) = x^2 y^2 \log(x^2 + y^2)$ if $(x, y) \ne (0, 0)$, $f(0, 0) = 0$.

b) $f(x, y) = xy \sin \dfrac{1}{x^2 + y^2}$ if $(x, y) \ne (0, 0)$, $f(0, 0) = 0$.

12.13 Let $f$ and $g$ be real-valued functions defined on $\mathbf{R}^1$ with continuous second derivatives
$f''$ and $g''$. Define

$$F(x, y) = f[x + g(y)] \quad \text{for each } (x, y) \text{ in } \mathbf{R}^2.$$

Find formulas for all partials of $F$ of first and second order in terms of the derivatives of
$f$ and $g$. Verify the relation

$$(D_1 F)(D_{1,2} F) = (D_2 F)(D_{1,1} F).$$

12.14 Given a function $f$ defined in $\mathbf{R}^2$. Let

$$F(r, \theta) = f(r \cos \theta, r \sin \theta).$$

a) Assume appropriate differentiability properties of $f$ and show that

$$D_1 F(r, \theta) = \cos \theta \, D_1 f(x, y) + \sin \theta \, D_2 f(x, y),$$

$$D_{1,1} F(r, \theta) = \cos^2 \theta \, D_{1,1} f(x, y) + 2 \sin \theta \cos \theta \, D_{1,2} f(x, y) + \sin^2 \theta \, D_{2,2} f(x, y),$$

where $x = r \cos \theta$, $y = r \sin \theta$.

b) Find similar formulas for $D_2 F$, $D_{1,2} F$, and $D_{2,2} F$.

c) Verify the formula

$$\|\nabla f(r \cos \theta, r \sin \theta)\|^2 = [D_1 F(r, \theta)]^2 + \frac{1}{r^2} [D_2 F(r, \theta)]^2.$$

12.15 If $f$ and $g$ have gradient vectors $\nabla f(\mathbf{x})$ and $\nabla g(\mathbf{x})$ at a point $\mathbf{x}$ in $\mathbf{R}^n$, show that the
product function $h$ defined by $h(\mathbf{x}) = f(\mathbf{x})g(\mathbf{x})$ also has a gradient vector at $\mathbf{x}$ and that

$$\nabla h(\mathbf{x}) = f(\mathbf{x})\nabla g(\mathbf{x}) + g(\mathbf{x})\nabla f(\mathbf{x}).$$

State and prove a similar result for the quotient $f/g$.

12.16 Let $f$ be a function having a derivative $f'$ at each point in $\mathbf{R}^1$ and let $g$ be defined
on $\mathbf{R}^3$ by the equation

$$g(x, y, z) = x^2 + y^2 + z^2.$$

If $h$ denotes the composite function $h = f \circ g$, show that

$$\|\nabla h(x, y, z)\|^2 = 4 g(x, y, z)\{f'[g(x, y, z)]\}^2.$$

12.17 Assume $f$ is differentiable at each point (x, y) in $\mathbf{R}^2$. Let $g_1$ and $g_2$ be defined on
$\mathbf{R}^3$ by the equations

$$g_1(x, y, z) = x^2 + y^2 + z^2, \qquad g_2(x, y, z) = x + y + z,$$

and let $\mathbf{g}$ be the vector-valued function whose values (in $\mathbf{R}^2$) are given by

$$\mathbf{g}(x, y, z) = (g_1(x, y, z), g_2(x, y, z)).$$

Let $h$ be the composite function $h = f \circ \mathbf{g}$ and show that

$$\|\nabla h\|^2 = 4 (D_1 f)^2 g_1 + 4 (D_1 f)(D_2 f) g_2 + 3 (D_2 f)^2.$$

12.18 Let $f$ be defined on an open set S in $\mathbf{R}^n$. We say that $f$ is homogeneous of degree p
over S if $f(\lambda \mathbf{x}) = \lambda^p f(\mathbf{x})$ for every real $\lambda$ and for every $\mathbf{x}$ in S for which $\lambda \mathbf{x} \in S$. If such a
function is differentiable at $\mathbf{x}$, show that

$$\mathbf{x} \cdot \nabla f(\mathbf{x}) = p f(\mathbf{x}).$$

Note. This is known as Euler's theorem for homogeneous functions. Hint. For fixed $\mathbf{x}$,
define $g(\lambda) = f(\lambda \mathbf{x})$ and compute $g'(1)$.

Also prove the converse. That is, show that if $\mathbf{x} \cdot \nabla f(\mathbf{x}) = p f(\mathbf{x})$ for all $\mathbf{x}$ in an open
set S, then $f$ must be homogeneous of degree p over S.

Mean-Value theorems 

12.19 Let $\mathbf{f} : \mathbf{R} \to \mathbf{R}^2$ be defined by the equation $\mathbf{f}(t) = (\cos t, \sin t)$. Then
$\mathbf{f}'(t)(u) = u(-\sin t, \cos t)$ for every real u. The Mean-Value formula

$$\mathbf{f}(y) - \mathbf{f}(x) = \mathbf{f}'(z)(y - x)$$

cannot hold when x = 0, $y = 2\pi$, since the left member is zero and the right member is a
vector of length $2\pi$. Nevertheless, Theorem 12.9 states that for every vector $\mathbf{a}$ in $\mathbf{R}^2$ there
is a z in the interval $(0, 2\pi)$ such that

$$\mathbf{a} \cdot \{\mathbf{f}(y) - \mathbf{f}(x)\} = \mathbf{a} \cdot \{\mathbf{f}'(z)(y - x)\}.$$

Determine z in terms of $\mathbf{a}$ when x = 0 and $y = 2\pi$.

12.20 Let $f$ be a real-valued function differentiable on a 2-ball $B(\mathbf{x})$. By considering the
function

$$g(t) = f[ty_1 + (1 - t)x_1, \ y_2] + f[x_1, \ ty_2 + (1 - t)x_2]$$

prove that

$$f(\mathbf{y}) - f(\mathbf{x}) = (y_1 - x_1) D_1 f(z_1, y_2) + (y_2 - x_2) D_2 f(x_1, z_2),$$

where $z_1 \in L(x_1, y_1)$ and $z_2 \in L(x_2, y_2)$.

12.21 State and prove a generalization of the result in Exercise 12.20 for a real-valued
function differentiable on an n-ball $B(\mathbf{x})$.

12.22 Let $f$ be real-valued and assume that the directional derivative $f'(\mathbf{c} + t\mathbf{u}; \mathbf{u})$ exists
for each t in the interval $0 \le t \le 1$. Prove that for some $\theta$ in the open interval (0, 1) we
have

$$f(\mathbf{c} + \mathbf{u}) - f(\mathbf{c}) = f'(\mathbf{c} + \theta \mathbf{u}; \mathbf{u}).$$

12.23 a) If $f$ is real-valued and if the directional derivative $f'(\mathbf{x}; \mathbf{u}) = 0$ for every $\mathbf{x}$ in an
n-ball $B(\mathbf{c})$ and every direction $\mathbf{u}$, prove that $f$ is constant on $B(\mathbf{c})$.

b) What can you conclude about $f$ if $f'(\mathbf{x}; \mathbf{u}) = 0$ for a fixed direction $\mathbf{u}$ and every
$\mathbf{x}$ in $B(\mathbf{c})$?

Derivatives of higher order and Taylor’s formula 

12.24 For each of the following functions, verify that the mixed partial derivatives $D_{1,2} f$
and $D_{2,1} f$ are equal.

a) $f(x, y) = x^4 + y^4 - 4x^2 y^2$.

b) $f(x, y) = \log(x^2 + y^2)$, $(x, y) \ne (0, 0)$.

c) $f(x, y) = \tan(x^2/y)$, $y \ne 0$.

12.25 Let $f$ be a function of two variables. Use induction and Theorem 12.13 to prove
that if the $2^k$ partial derivatives of $f$ of order k are continuous in a neighborhood of a point
(x, y), then all mixed partials of the form $D_{r_1, \dots, r_k} f$ and $D_{p_1, \dots, p_k} f$ will be equal at (x, y)
if the k-tuple $(r_1, \dots, r_k)$ contains the same number of ones as the k-tuple $(p_1, \dots, p_k)$.

12.26 If $f$ is a function of two variables having continuous partials of order k on some
open set S in $\mathbf{R}^2$, show that

$$f^{(k)}(\mathbf{x}; \mathbf{t}) = \sum_{r=0}^{k} \binom{k}{r} t_1^r t_2^{k-r} D_{p_1, \dots, p_k} f(\mathbf{x}), \quad \text{if } \mathbf{x} \in S, \ \mathbf{t} = (t_1, t_2),$$

where in the rth term we have $p_1 = \cdots = p_r = 1$ and $p_{r+1} = \cdots = p_k = 2$. Use this
result to give an alternative expression for Taylor's formula (Theorem 12.14) in the case
when n = 2. The symbol $\binom{k}{r}$ is the binomial coefficient $k!/[r!\,(k - r)!]$.

12.27 Use Taylor's formula to express the following in powers of (x - 1) and (y - 2):

a) $f(x, y) = x^3 + y^3 + xy^2$, b) $f(x, y) = x^2 + xy + y^2$.

CHAPTER 13


IMPLICIT FUNCTIONS 
AND EXTREMUM PROBLEMS 


13.1 INTRODUCTION 

This chapter consists of two principal parts. The first part discusses an important 
theorem of analysis called the implicit function theorem; the second part treats 
extremum problems. Both parts use the theorems developed in Chapter 12. 

The implicit function theorem in its simplest form deals with an equation of 
the form 

f(x, t) = 0. (1) 

The problem is to decide whether this equation determines x as a function of t.
If so, we have

x = g(t),

for some function g. We say that g is defined "implicitly" by (1).

The problem assumes a more general form when we have a system of several
equations involving several variables and we ask whether we can solve these
equations for some of the variables in terms of the remaining variables. This is
the same type of problem as above, except that x and t are replaced by vectors,
and f and g are replaced by vector-valued functions. Under rather general
conditions, a solution always exists. The implicit function theorem gives a description
of these conditions and some conclusions about the solution.

An important special case is the familiar problem in algebra of solving n linear
equations of the form

$$\sum_{j=1}^{n} a_{ij} x_j = t_i \quad (i = 1, 2, \dots, n), \tag{2}$$

where the $a_{ij}$ and $t_i$ are considered as given numbers and $x_1, \dots, x_n$ represent
unknowns. In linear algebra it is shown that such a system has a unique solution
if, and only if, the determinant of the coefficient matrix $A = [a_{ij}]$ is nonzero.

Note. The determinant of a square matrix $A = [a_{ij}]$ is denoted by det A or
det $[a_{ij}]$. If det $[a_{ij}] \ne 0$, the solution of (2) can be obtained by Cramer's rule
which expresses each $x_k$ as a quotient of two determinants, say $x_k = A_k/D$, where
$D = \det [a_{ij}]$ and $A_k$ is the determinant of the matrix obtained by replacing the
kth column of $[a_{ij}]$ by $t_1, \dots, t_n$. (For a proof of Cramer's rule, see Reference
13.1, Theorem 3.14.) In particular, if each $t_i = 0$, then each $x_k = 0$.

Next we show that the system (2) can be written in the form (1). Each equation
in (2) has the form

$$f_i(\mathbf{x}, \mathbf{t}) = 0 \quad \text{where } \mathbf{x} = (x_1, \dots, x_n), \ \mathbf{t} = (t_1, \dots, t_n),$$

and

$$f_i(\mathbf{x}, \mathbf{t}) = \sum_{j=1}^{n} a_{ij} x_j - t_i.$$

Therefore the system in (2) can be expressed as one vector equation $\mathbf{f}(\mathbf{x}, \mathbf{t}) = \mathbf{0}$,
where $\mathbf{f} = (f_1, \dots, f_n)$. If $D_j f_i$ denotes the partial derivative of $f_i$ with respect to
the jth coordinate $x_j$, then $D_j f_i(\mathbf{x}, \mathbf{t}) = a_{ij}$. Thus the coefficient matrix $A = [a_{ij}]$
in (2) is a Jacobian matrix. Linear algebra tells us that (2) has a unique solution if
the determinant of this Jacobian matrix is nonzero.
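Cramer's rule, as described in the note above, can be sketched in a few lines for small systems. This is an illustrative sketch only: the helper names `det` and `cramer_solve` are hypothetical, the determinant uses cofactor expansion (reasonable only for small n), and exact rational arithmetic is used so the results can be compared exactly.

```python
from fractions import Fraction

def det(m):
    # Determinant by cofactor expansion along the first row (small matrices only).
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [list(row[:j]) + list(row[j + 1:]) for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cramer_solve(a, t):
    """Solve sum_j a[i][j] * x[j] = t[i] by Cramer's rule: x_k = A_k / D."""
    d = det(a)
    if d == 0:
        raise ValueError("det [a_ij] = 0: Cramer's rule does not apply")
    n = len(a)
    solution = []
    for k in range(n):
        # Replace the kth column of [a_ij] by t_1, ..., t_n.
        a_k = [list(row[:k]) + [t[i]] + list(row[k + 1:]) for i, row in enumerate(a)]
        solution.append(Fraction(det(a_k)) / Fraction(d))
    return solution

print(cramer_solve([[2, 1], [1, 3]], [3, 5]))
print(cramer_solve([[2, 1], [1, 3]], [0, 0]))
```

The second call illustrates the remark that if each t_i = 0, then each x_k = 0, since every numerator determinant then has a zero column.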

In the general implicit function theorem, the nonvanishing of the determinant 
of a Jacobian matrix also plays a role. This comes about by approximating f by 
a linear function. The equation f(x, t) = 0 gets replaced by a system of linear 
equations whose coefficient matrix is the Jacobian matrix of f. 

Notation. If $\mathbf{f} = (f_1, \dots, f_n)$ and $\mathbf{x} = (x_1, \dots, x_n)$, the Jacobian matrix
$D\mathbf{f}(\mathbf{x}) = [D_j f_i(\mathbf{x})]$ is an n x n matrix. Its determinant is called a Jacobian
determinant and is denoted by $J_{\mathbf{f}}(\mathbf{x})$. Thus,

$$J_{\mathbf{f}}(\mathbf{x}) = \det D\mathbf{f}(\mathbf{x}) = \det [D_j f_i(\mathbf{x})].$$

The notation

$$\frac{\partial(f_1, \dots, f_n)}{\partial(x_1, \dots, x_n)}$$

is also used to denote the Jacobian determinant $J_{\mathbf{f}}(\mathbf{x})$.

The next theorem relates the Jacobian determinant of a complex-valued
function with its derivative.

Theorem 13.1. If $f = u + iv$ is a complex-valued function with a derivative at a
point z in $\mathbf{C}$, then $J_f(z) = |f'(z)|^2$.

Proof. We have $f'(z) = D_1 u + iD_1 v$, so $|f'(z)|^2 = (D_1 u)^2 + (D_1 v)^2$. Also,

$$J_f(z) = \det \begin{bmatrix} D_1 u & D_2 u \\ D_1 v & D_2 v \end{bmatrix} = D_1 u \, D_2 v - D_1 v \, D_2 u = (D_1 u)^2 + (D_1 v)^2,$$

by the Cauchy-Riemann equations.
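Theorem 13.1 can be checked numerically for a particular analytic function. The sketch below assumes the sample function f(z) = z^3 (chosen only for illustration) and approximates the four partials of u and v by central differences; the step size and helper names are likewise assumptions.

```python
def f(z):
    # Sample analytic function (an assumption for illustration).
    return z**3

def f_prime(z):
    return 3 * z**2

def jacobian_determinant(z, h=1e-6):
    # d1 approximates D_1 u + i D_1 v (derivative in the real direction).
    d1 = (f(z + h) - f(z - h)) / (2 * h)
    # d2 approximates D_2 u + i D_2 v (derivative in the imaginary direction).
    d2 = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    # J_f = D_1 u * D_2 v - D_1 v * D_2 u.
    return d1.real * d2.imag - d1.imag * d2.real

z = 1 + 2j
print(jacobian_determinant(z))   # numerical Jacobian determinant
print(abs(f_prime(z)) ** 2)      # |f'(z)|^2
```

At z = 1 + 2i we have f'(z) = -9 + 12i, so both quantities should be close to 225.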


13.2 FUNCTIONS WITH NONZERO JACOBIAN DETERMINANT 

This section gives some properties of functions with nonzero Jacobian determinant
at certain points. These results will be used later in the proof of the implicit function
theorem.

Theorem 13.2. Let $B = B(\mathbf{a}; r)$ be an n-ball in $\mathbf{R}^n$, let $\partial B$ denote its boundary,

$$\partial B = \{\mathbf{x} : \|\mathbf{x} - \mathbf{a}\| = r\},$$

and let $\overline{B} = B \cup \partial B$ denote its closure. Let $\mathbf{f} = (f_1, \dots, f_n)$ be continuous on $\overline{B}$,
and assume that all the partial derivatives $D_j f_i(\mathbf{x})$ exist if $\mathbf{x} \in B$. Assume further
that $\mathbf{f}(\mathbf{x}) \ne \mathbf{f}(\mathbf{a})$ if $\mathbf{x} \in \partial B$ and that the Jacobian determinant $J_{\mathbf{f}}(\mathbf{x}) \ne 0$ for each
$\mathbf{x}$ in B. Then $\mathbf{f}(B)$, the image of B under $\mathbf{f}$, contains an n-ball with center at $\mathbf{f}(\mathbf{a})$.

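Proof. Define a real-valued function $g$ on $\partial B$ as follows:

$$g(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a})\| \quad \text{if } \mathbf{x} \in \partial B.$$

Then $g(\mathbf{x}) > 0$ for each $\mathbf{x}$ in $\partial B$ because $\mathbf{f}(\mathbf{x}) \ne \mathbf{f}(\mathbf{a})$ if $\mathbf{x} \in \partial B$. Also, $g$ is continuous
on $\partial B$ since $\mathbf{f}$ is continuous on $\overline{B}$. Since $\partial B$ is compact, $g$ takes on its absolute
minimum (call it m) somewhere on $\partial B$. Note that m > 0 since $g$ is positive on $\partial B$.
Let T denote the n-ball

$$T = B\left(\mathbf{f}(\mathbf{a}); \frac{m}{2}\right).$$

We will prove that $T \subseteq \mathbf{f}(B)$ and this will prove the theorem. (See Fig. 13.1.)

To do this we show that $\mathbf{y} \in T$ implies $\mathbf{y} \in \mathbf{f}(B)$. Choose a point $\mathbf{y}$ in T, keep
$\mathbf{y}$ fixed, and define a new real-valued function $h$ on $\overline{B}$ as follows:

$$h(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\| \quad \text{if } \mathbf{x} \in \overline{B}.$$

Then $h$ is continuous on the compact set $\overline{B}$ and hence attains its absolute minimum
on $\overline{B}$. We will show that $h$ attains its minimum somewhere in the open n-ball B.
At the center we have $h(\mathbf{a}) = \|\mathbf{f}(\mathbf{a}) - \mathbf{y}\| < m/2$ since $\mathbf{y} \in T$. Hence the minimum
value of $h$ in $\overline{B}$ must also be $< m/2$. But at each point $\mathbf{x}$ on the boundary $\partial B$ we
have

$$h(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\| = \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a}) - (\mathbf{y} - \mathbf{f}(\mathbf{a}))\| \ge \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{a})\| - \|\mathbf{f}(\mathbf{a}) - \mathbf{y}\| > g(\mathbf{x}) - \frac{m}{2} \ge \frac{m}{2},$$

so the minimum of $h$ cannot occur on the boundary $\partial B$. Hence there is an interior
point $\mathbf{c}$ in B at which $h$ attains its minimum. At this point the square of $h$ also has
a minimum. Since

$$h^2(\mathbf{x}) = \|\mathbf{f}(\mathbf{x}) - \mathbf{y}\|^2 = \sum_{r=1}^{n} [f_r(\mathbf{x}) - y_r]^2,$$

and since each partial derivative $D_k(h^2)$ must be zero at $\mathbf{c}$, we must have

$$\sum_{r=1}^{n} [f_r(\mathbf{c}) - y_r] D_k f_r(\mathbf{c}) = 0 \quad \text{for } k = 1, 2, \dots, n.$$

But this is a system of linear equations whose determinant $J_{\mathbf{f}}(\mathbf{c})$ is not zero, since
$\mathbf{c} \in B$. Therefore $f_r(\mathbf{c}) = y_r$ for each r, or $\mathbf{f}(\mathbf{c}) = \mathbf{y}$. That is, $\mathbf{y} \in \mathbf{f}(B)$. Hence
$T \subseteq \mathbf{f}(B)$ and the proof is complete.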

A function $f : S \to T$ from one metric space $(S, d_S)$ to another $(T, d_T)$ is
called an open mapping if, for every open set A in S, the image $f(A)$ is open in T.

The next theorem gives a sufficient condition for a mapping to carry open sets
onto open sets. (See also Theorem 13.5.)

Theorem 13.3. Let A be an open subset of $\mathbf{R}^n$ and assume that $\mathbf{f} : A \to \mathbf{R}^n$ is continuous
and has finite partial derivatives $D_j f_i$ on A. If $\mathbf{f}$ is one-to-one on A and if
$J_{\mathbf{f}}(\mathbf{x}) \ne 0$ for each $\mathbf{x}$ in A, then $\mathbf{f}(A)$ is open.

Proof. If $\mathbf{b} \in \mathbf{f}(A)$, then $\mathbf{b} = \mathbf{f}(\mathbf{a})$ for some $\mathbf{a}$ in A. There is an n-ball $B(\mathbf{a}; r) \subseteq A$
on which $\mathbf{f}$ satisfies the hypotheses of Theorem 13.2, so $\mathbf{f}(B)$ contains an n-ball
with center at $\mathbf{b}$. Therefore, $\mathbf{b}$ is an interior point of $\mathbf{f}(A)$, so $\mathbf{f}(A)$ is open.

The next theorem shows that a function with continuous partial derivatives is 
locally one-to-one near a point where the Jacobian determinant does not vanish. 

Theorem 13.4. Assume that $\mathbf{f} = (f_1, \dots, f_n)$ has continuous partial derivatives
$D_j f_i$ on an open set S in $\mathbf{R}^n$, and that the Jacobian determinant $J_{\mathbf{f}}(\mathbf{a}) \ne 0$ for some
point $\mathbf{a}$ in S. Then there is an n-ball $B(\mathbf{a})$ on which $\mathbf{f}$ is one-to-one.

Proof. Let $\mathbf{Z}_1, \dots, \mathbf{Z}_n$ be n points in S and let $\mathbf{Z} = (\mathbf{Z}_1; \dots; \mathbf{Z}_n)$ denote that
point in $\mathbf{R}^{n^2}$ whose first n components are the components of $\mathbf{Z}_1$, whose next n
components are the components of $\mathbf{Z}_2$, and so on. Define a real-valued function
$h$ as follows:

$$h(\mathbf{Z}) = \det [D_j f_i(\mathbf{Z}_i)].$$

This function is continuous at those points $\mathbf{Z}$ in $\mathbf{R}^{n^2}$ where $h(\mathbf{Z})$ is defined because
each $D_j f_i$ is continuous on S and a determinant is a polynomial in its $n^2$ entries.
Let $\mathbf{Z}$ be the special point in $\mathbf{R}^{n^2}$ obtained by putting

$$\mathbf{Z}_1 = \mathbf{Z}_2 = \cdots = \mathbf{Z}_n = \mathbf{a}.$$

Then $h(\mathbf{Z}) = J_{\mathbf{f}}(\mathbf{a}) \ne 0$ and hence, by continuity, there is some n-ball $B(\mathbf{a})$ such
that $\det [D_j f_i(\mathbf{Z}_i)] \ne 0$ if each $\mathbf{Z}_i \in B(\mathbf{a})$. We will prove that $\mathbf{f}$ is one-to-one on
$B(\mathbf{a})$.

Assume the contrary. That is, assume that $\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{y})$ for some pair of points
$\mathbf{x} \ne \mathbf{y}$ in $B(\mathbf{a})$. Since $B(\mathbf{a})$ is convex, the line segment $L(\mathbf{x}, \mathbf{y}) \subseteq B(\mathbf{a})$ and we can
apply the Mean-Value Theorem to each component of $\mathbf{f}$ to write

$$0 = f_i(\mathbf{y}) - f_i(\mathbf{x}) = \nabla f_i(\mathbf{Z}_i) \cdot (\mathbf{y} - \mathbf{x}) \quad \text{for } i = 1, 2, \dots, n,$$

where each $\mathbf{Z}_i \in L(\mathbf{x}, \mathbf{y})$ and hence $\mathbf{Z}_i \in B(\mathbf{a})$. (The Mean-Value Theorem is
applicable because $\mathbf{f}$ is differentiable on S.) But this is a system of linear equations
of the form

$$\sum_{k=1}^{n} (y_k - x_k) a_{ik} = 0 \quad \text{with } a_{ik} = D_k f_i(\mathbf{Z}_i).$$

The determinant of this system is not zero, since $\mathbf{Z}_i \in B(\mathbf{a})$. Hence $y_k - x_k = 0$
for each k, and this contradicts the assumption that $\mathbf{x} \ne \mathbf{y}$. We have shown,
therefore, that $\mathbf{x} \ne \mathbf{y}$ implies $\mathbf{f}(\mathbf{x}) \ne \mathbf{f}(\mathbf{y})$ and hence that $\mathbf{f}$ is one-to-one on $B(\mathbf{a})$.

Note. The reader should be cautioned that Theorem 13.4 is a local theorem and
not a global theorem. The nonvanishing of $J_{\mathbf{f}}(\mathbf{a})$ guarantees that $\mathbf{f}$ is one-to-one
on a neighborhood of $\mathbf{a}$. It does not follow that $\mathbf{f}$ is one-to-one on S, even when
$J_{\mathbf{f}}(\mathbf{x}) \ne 0$ for every $\mathbf{x}$ in S. The following example illustrates this point. Let $f$ be
the complex-valued function defined by $f(z) = e^z$ if $z \in \mathbf{C}$. If $z = x + iy$ we have

$$J_f(z) = |f'(z)|^2 = |e^z|^2 = e^{2x}.$$

Thus $J_f(z) \ne 0$ for every z in $\mathbf{C}$. However, $f$ is not one-to-one on $\mathbf{C}$ because
$f(z_1) = f(z_2)$ for every pair of points $z_1$ and $z_2$ which differ by $2\pi i$.
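The exponential example in this note is easy to confirm numerically. The following is a minimal sketch using Python's standard `cmath` module; the particular point z1 is an arbitrary choice made for illustration.

```python
import cmath
import math

# Two distinct points that differ by 2*pi*i.
z1 = complex(0.3, -1.2)
z2 = z1 + 2j * math.pi

w1 = cmath.exp(z1)
w2 = cmath.exp(z2)

# Jacobian determinant J_f(z) = |f'(z)|^2 = |e^z|^2 = e^(2x), never zero.
jacobian = abs(cmath.exp(z1)) ** 2

print(z1 != z2)            # the points are different
print(abs(w1 - w2))        # yet their images agree up to rounding error
print(jacobian, math.exp(2 * z1.real))
```

The Jacobian determinant is positive at every point, yet the two distinct points have the same image, so the one-to-one property holds only locally.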

The next theorem gives a global property of functions with nonzero Jacobian 
determinant. 


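Theorem 13.5. Let A be an open subset of $\mathbf{R}^n$ and assume that $\mathbf{f} : A \to \mathbf{R}^n$ has
continuous partial derivatives $D_j f_i$ on A. If $J_{\mathbf{f}}(\mathbf{x}) \ne 0$ for all $\mathbf{x}$ in A, then $\mathbf{f}$ is an
open mapping.

Proof. Let S be any open subset of A. If $\mathbf{x} \in S$ there is an n-ball $B(\mathbf{x})$ in which $\mathbf{f}$
is one-to-one (by Theorem 13.4). Therefore, by Theorem 13.3, the image $\mathbf{f}(B(\mathbf{x}))$
is open in $\mathbf{R}^n$. But we can write $S = \bigcup_{\mathbf{x} \in S} B(\mathbf{x})$. Applying $\mathbf{f}$ we find
$\mathbf{f}(S) = \bigcup_{\mathbf{x} \in S} \mathbf{f}(B(\mathbf{x}))$, so $\mathbf{f}(S)$ is open.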

Note. If a function $\mathbf{f} = (f_1, \dots, f_n)$ has continuous partial derivatives on a set S,
we say that $\mathbf{f}$ is continuously differentiable on S, and we write $\mathbf{f} \in C'$ on S. In view
of Theorem 12.11, continuous differentiability at a point implies differentiability
at that point.

Theorem 13.4 shows that a continuously differentiable function with a nonvanishing
Jacobian at a point $\mathbf{a}$ has a local inverse in a neighborhood of $\mathbf{a}$. The
next theorem gives some local differentiability properties of this local inverse
function.

13.3 THE INVERSE FUNCTION THEOREM

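Theorem 13.6. Assume $\mathbf{f} = (f_1, \dots, f_n) \in C'$ on an open set S in $\mathbf{R}^n$, and let
$T = \mathbf{f}(S)$. If the Jacobian determinant $J_{\mathbf{f}}(\mathbf{a}) \ne 0$ for some point $\mathbf{a}$ in S, then there
are two open sets $X \subseteq S$ and $Y \subseteq T$ and a uniquely determined function $\mathbf{g}$ such that

a) $\mathbf{a} \in X$ and $\mathbf{f}(\mathbf{a}) \in Y$,

b) $Y = \mathbf{f}(X)$,

c) $\mathbf{f}$ is one-to-one on X,

d) $\mathbf{g}$ is defined on Y, $\mathbf{g}(Y) = X$, and $\mathbf{g}[\mathbf{f}(\mathbf{x})] = \mathbf{x}$ for every $\mathbf{x}$ in X,

e) $\mathbf{g} \in C'$ on Y.

Proof. The function $J_{\mathbf{f}}$ is continuous on S and, since $J_{\mathbf{f}}(\mathbf{a}) \ne 0$, there is an n-ball
$B_1(\mathbf{a})$ such that $J_{\mathbf{f}}(\mathbf{x}) \ne 0$ for all $\mathbf{x}$ in $B_1(\mathbf{a})$. By Theorem 13.4, there is an n-ball
$B(\mathbf{a}) \subseteq B_1(\mathbf{a})$ on which $\mathbf{f}$ is one-to-one. Let B be an n-ball with center at $\mathbf{a}$ and
radius smaller than that of $B(\mathbf{a})$. Then, by Theorem 13.2, $\mathbf{f}(B)$ contains an n-ball
with center at $\mathbf{f}(\mathbf{a})$. Denote this by Y and let $X = \mathbf{f}^{-1}(Y) \cap B$. Then X is open
since both $\mathbf{f}^{-1}(Y)$ and B are open. (See Fig. 13.2.)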



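The set $\overline{B}$ (the closure of B) is compact and $\mathbf{f}$ is one-to-one and continuous on
$\overline{B}$. Hence, by Theorem 4.29, there exists a function $\mathbf{g}$ (the inverse function $\mathbf{f}^{-1}$ of
Theorem 4.29) defined on $\mathbf{f}(\overline{B})$ such that $\mathbf{g}[\mathbf{f}(\mathbf{x})] = \mathbf{x}$ for all $\mathbf{x}$ in $\overline{B}$. Moreover, $\mathbf{g}$
is continuous on $\mathbf{f}(\overline{B})$. Since $X \subseteq \overline{B}$ and $Y \subseteq \mathbf{f}(\overline{B})$, this proves parts (a), (b), (c)
and (d). The uniqueness of $\mathbf{g}$ follows from (d).

Next we prove (e). For this purpose, define a real-valued function $h$ by the
equation $h(\mathbf{Z}) = \det [D_j f_i(\mathbf{Z}_i)]$, where $\mathbf{Z}_1, \dots, \mathbf{Z}_n$ are n points in S, and
$\mathbf{Z} = (\mathbf{Z}_1; \dots; \mathbf{Z}_n)$ is the corresponding point in $\mathbf{R}^{n^2}$. Then, arguing as in the proof
of Theorem 13.4, there is an n-ball $B_2(\mathbf{a})$ such that $h(\mathbf{Z}) \ne 0$ if each $\mathbf{Z}_i \in B_2(\mathbf{a})$.
We can now assume that, in the earlier part of the proof, the n-ball $B(\mathbf{a})$ was chosen
so that $B(\mathbf{a}) \subseteq B_2(\mathbf{a})$. Then $\overline{B} \subseteq B_2(\mathbf{a})$ and $h(\mathbf{Z}) \ne 0$ if each $\mathbf{Z}_i \in \overline{B}$.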

To prove (e), write $\mathbf{g} = (g_1, \dots, g_n)$. We will show that each $g_k \in C'$ on Y.
To prove that $D_r g_k$ exists on Y, assume $\mathbf{y} \in Y$ and consider the difference quotient
$[g_k(\mathbf{y} + t\mathbf{u}_r) - g_k(\mathbf{y})]/t$, where $\mathbf{u}_r$ is the rth unit coordinate vector. (Since Y is
open, $\mathbf{y} + t\mathbf{u}_r \in Y$ if t is sufficiently small.) Let $\mathbf{x} = \mathbf{g}(\mathbf{y})$ and let $\mathbf{x}' = \mathbf{g}(\mathbf{y} + t\mathbf{u}_r)$.
Then both $\mathbf{x}$ and $\mathbf{x}'$ are in X and $\mathbf{f}(\mathbf{x}') - \mathbf{f}(\mathbf{x}) = t\mathbf{u}_r$. Hence $f_i(\mathbf{x}') - f_i(\mathbf{x})$ is 0 if
$i \ne r$, and is t if i = r. By the Mean-Value Theorem we have

$$\frac{f_i(\mathbf{x}') - f_i(\mathbf{x})}{t} = \nabla f_i(\mathbf{Z}_i) \cdot \frac{\mathbf{x}' - \mathbf{x}}{t} \quad \text{for } i = 1, 2, \dots, n,$$

where each $\mathbf{Z}_i$ is on the line segment joining $\mathbf{x}$ and $\mathbf{x}'$; hence $\mathbf{Z}_i \in B$. The expression
on the left is 1 or 0, according to whether i = r or $i \ne r$. This is a system of n
linear equations in n unknowns $(x_j' - x_j)/t$ and has a unique solution, since

$$\det [D_j f_i(\mathbf{Z}_i)] = h(\mathbf{Z}) \ne 0.$$

Solving for the kth unknown by Cramer's rule, we obtain an expression for
$[g_k(\mathbf{y} + t\mathbf{u}_r) - g_k(\mathbf{y})]/t$ as a quotient of determinants. As $t \to 0$, the point $\mathbf{x}' \to \mathbf{x}$,
since $\mathbf{g}$ is continuous, and hence each $\mathbf{Z}_i \to \mathbf{x}$, since $\mathbf{Z}_i$ is on the segment joining
$\mathbf{x}$ to $\mathbf{x}'$. The determinant which appears in the denominator has for its limit the
number $\det [D_j f_i(\mathbf{x})] = J_{\mathbf{f}}(\mathbf{x})$, and this is nonzero, since $\mathbf{x} \in X$. Therefore, the
following limit exists:

$$\lim_{t \to 0} \frac{g_k(\mathbf{y} + t\mathbf{u}_r) - g_k(\mathbf{y})}{t} = D_r g_k(\mathbf{y}).$$

This establishes the existence of $D_r g_k(\mathbf{y})$ for each $\mathbf{y}$ in Y and each r = 1, 2, ..., n.
Moreover, this limit is a quotient of two determinants involving the derivatives
$D_j f_i(\mathbf{x})$. Continuity of the $D_j f_i$ implies continuity of each partial $D_r g_k$. This
completes the proof of (e).


Note. The foregoing proof also provides a method for computing $D_r g_k(\mathbf{y})$. In
practice, the derivatives $D_r g_k$ can be obtained more easily (without recourse to a
limiting process) by using the fact that, if $\mathbf{y} = \mathbf{f}(\mathbf{x})$, the product of the two Jacobian
matrices $D\mathbf{f}(\mathbf{x})$ and $D\mathbf{g}(\mathbf{y})$ is the identity matrix. When this is written out in detail
it gives the following system of $n^2$ equations:

$$\sum_{k=1}^{n} D_k g_i(\mathbf{y}) D_j f_k(\mathbf{x}) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$

For each fixed i, we obtain n linear equations as j runs through the values
1, 2, ..., n. These can then be solved for the n unknowns, $D_1 g_i(\mathbf{y}), \dots, D_n g_i(\mathbf{y})$,
by Cramer's rule, or by some other method.
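The relation between the two Jacobian matrices can be illustrated with a map whose local inverse is known in closed form. The sketch below assumes the polar-coordinate map f(r, theta) = (r cos theta, r sin theta), with local inverse g(x, y) = (sqrt(x^2 + y^2), atan2(y, x)) for r > 0 and theta in (-pi, pi); this example and the helper names are illustrative and do not come from the text.

```python
import math

def f(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_f(r, theta):
    # Rows are components of f, columns are the variables (r, theta).
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def invert_2x2(m):
    # Solves the n^2 = 4 linear equations Dg * Df = I for the entries of Dg.
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("Jacobian determinant is zero")
    return [[d / det, -b / det],
            [-c / det, a / det]]

def jacobian_g_exact(x, y):
    # Partials of g(x, y) = (sqrt(x^2 + y^2), atan2(y, x)), computed by hand.
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

r, theta = 2.0, 0.7
x, y = f(r, theta)
from_linear_system = invert_2x2(jacobian_f(r, theta))
from_formula = jacobian_g_exact(x, y)
print(from_linear_system)
print(from_formula)
```

The two matrices should agree entry by entry up to rounding, which reflects the fact that Dg(y) is the inverse of Df(x) when y = f(x).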


13.4 THE IMPLICIT FUNCTION THEOREM 

The reader knows that the equation of a curve in the xy-plane can be expressed
either in an "explicit" form, such as y = f(x), or in an "implicit" form, such as
F(x, y) = 0. However, if we are given an equation of the form F(x, y) = 0, this
does not necessarily represent a function. (Take, for example, $x^2 + y^2 - 5 = 0$.)
The equation F(x, y) = 0 does always represent a relation, namely, that set of all
pairs (x, y) which satisfy the equation. The following question therefore presents
itself quite naturally: When is the relation defined by F(x, y) = 0 also a function?
In other words, when can the equation F(x, y) = 0 be solved explicitly for y in
terms of x, yielding a unique solution? The implicit function theorem deals with
this question locally. It tells us that, given a point $(x_0, y_0)$ such that $F(x_0, y_0) = 0$,
under certain conditions there will be a neighborhood of $(x_0, y_0)$ such that in this
neighborhood the relation defined by F(x, y) = 0 is also a function. The conditions
are that $F$ and $D_2 F$ be continuous in some neighborhood of $(x_0, y_0)$ and that
$D_2 F(x_0, y_0) \ne 0$. In its more general form, the theorem treats, instead of one
equation in two variables, a system of n equations in n + k variables:

$$f_r(x_1, \dots, x_n; t_1, \dots, t_k) = 0 \quad (r = 1, 2, \dots, n).$$

This system can be solved for $x_1, \dots, x_n$ in terms of $t_1, \dots, t_k$, provided that
certain partial derivatives are continuous and provided that the n x n Jacobian
determinant $\partial(f_1, \dots, f_n)/\partial(x_1, \dots, x_n)$ is not zero.
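The local nature of the question can be illustrated with the circle mentioned above. The sketch below takes F(x, y) = x^2 + y^2 - 5 near the point (1, 2), where D_2 F = 4 is nonzero, and recovers y for nearby x by Newton's method in the variable y. The use of Newton's method, the starting point, and the helper names are assumptions made for illustration; the slope formula -D_1 F / D_2 F used for comparison is the standard consequence of differentiating F(x, g(x)) = 0 by the chain rule.

```python
import math

def F(x, y):
    return x * x + y * y - 5.0

def D1F(x, y):
    return 2.0 * x

def D2F(x, y):
    return 2.0 * y

def solve_for_y(x, y_start=2.0, steps=50):
    # Newton's method in y with x held fixed, started near y0 = 2.
    y = y_start
    for _ in range(steps):
        y -= F(x, y) / D2F(x, y)
    return y

def slope_at(x, eps=1e-5):
    # Central-difference derivative of the implicitly defined function.
    return (solve_for_y(x + eps) - solve_for_y(x - eps)) / (2 * eps)

x0, y0 = 1.0, 2.0
print(solve_for_y(1.1), math.sqrt(5.0 - 1.1**2))   # implicit vs explicit branch
print(slope_at(x0), -D1F(x0, y0) / D2F(x0, y0))    # both close to -0.5
```

Starting Newton's method near y = -2 instead would pick out the lower branch, which is why the relation is a function only in a suitable neighborhood of (1, 2).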

For brevity, we shall adopt the following notation in this theorem: Points in
(n + k)-dimensional space $\mathbf{R}^{n+k}$ will be written in the form $(\mathbf{x}; \mathbf{t})$, where

$$\mathbf{x} = (x_1, \dots, x_n) \in \mathbf{R}^n \quad \text{and} \quad \mathbf{t} = (t_1, \dots, t_k) \in \mathbf{R}^k.$$

Theorem 13.7 (Implicit function theorem). Let $\mathbf{f} = (f_1, \dots, f_n)$ be a vector-valued
function defined on an open set S in $\mathbf{R}^{n+k}$ with values in $\mathbf{R}^n$. Suppose $\mathbf{f} \in C'$ on S.
Let $(\mathbf{x}_0; \mathbf{t}_0)$ be a point in S for which $\mathbf{f}(\mathbf{x}_0; \mathbf{t}_0) = \mathbf{0}$ and for which the n x n
determinant $\det [D_j f_i(\mathbf{x}_0; \mathbf{t}_0)] \ne 0$. Then there exists a k-dimensional open set $T_0$
containing $\mathbf{t}_0$ and one, and only one, vector-valued function $\mathbf{g}$, defined on $T_0$ and having
values in $\mathbf{R}^n$, such that

a) $\mathbf{g} \in C'$ on $T_0$,

b) $\mathbf{g}(\mathbf{t}_0) = \mathbf{x}_0$,

c) $\mathbf{f}(\mathbf{g}(\mathbf{t}); \mathbf{t}) = \mathbf{0}$ for every $\mathbf{t}$ in $T_0$.

Proof. We shall apply the inverse function theorem to a certain vector-valued
function $\mathbf{F} = (F_1, \dots, F_n; F_{n+1}, \dots, F_{n+k})$ defined on S and having values in
$\mathbf{R}^{n+k}$. The function $\mathbf{F}$ is defined as follows: For $1 \le m \le n$, let $F_m(\mathbf{x}; \mathbf{t}) = f_m(\mathbf{x}; \mathbf{t})$,
and for $1 \le m \le k$, let $F_{n+m}(\mathbf{x}; \mathbf{t}) = t_m$. We can then write $\mathbf{F} = (\mathbf{f}; \mathbf{I})$, where
$\mathbf{f} = (f_1, \dots, f_n)$ and where $\mathbf{I}$ is the identity function defined by $\mathbf{I}(\mathbf{t}) = \mathbf{t}$ for each $\mathbf{t}$
in $\mathbf{R}^k$. The Jacobian $J_{\mathbf{F}}(\mathbf{x}; \mathbf{t})$ then has the same value as the n x n determinant
$\det [D_j f_i(\mathbf{x}; \mathbf{t})]$ because the terms which appear in the last k rows and also in the
last k columns of $J_{\mathbf{F}}(\mathbf{x}; \mathbf{t})$ form a k x k determinant with ones along the main
diagonal and zeros elsewhere; the intersection of the first n rows and n columns
consists of the determinant $\det [D_j f_i(\mathbf{x}; \mathbf{t})]$, and

$$D_i F_{n+j}(\mathbf{x}; \mathbf{t}) = 0 \quad \text{for } 1 \le i \le n, \ 1 \le j \le k.$$

Hence the Jacobian $J_{\mathbf{F}}(\mathbf{x}_0; \mathbf{t}_0) \ne 0$. Also, $\mathbf{F}(\mathbf{x}_0; \mathbf{t}_0) = (\mathbf{0}; \mathbf{t}_0)$. Therefore, by
Theorem 13.6, there exist open sets X and Y containing $(\mathbf{x}_0; \mathbf{t}_0)$ and $(\mathbf{0}; \mathbf{t}_0)$,
respectively, such that $\mathbf{F}$ is one-to-one on X, and $X = \mathbf{F}^{-1}(Y)$. Also, there exists
a local inverse function $\mathbf{G}$, defined on Y and having values in X, such that

$$\mathbf{G}[\mathbf{F}(\mathbf{x}; \mathbf{t})] = (\mathbf{x}; \mathbf{t}),$$

and such that $\mathbf{G} \in C'$ on Y.

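Now $\mathbf{G}$ can be reduced to components as follows: $\mathbf{G} = (\mathbf{v}; \mathbf{w})$ where
$\mathbf{v} = (v_1, \dots, v_n)$ is a vector-valued function defined on Y with values in $\mathbf{R}^n$ and
$\mathbf{w} = (w_1, \dots, w_k)$ is also defined on Y but has values in $\mathbf{R}^k$. We can now determine
$\mathbf{v}$ and $\mathbf{w}$ explicitly. The equation $\mathbf{G}[\mathbf{F}(\mathbf{x}; \mathbf{t})] = (\mathbf{x}; \mathbf{t})$, when written in terms of the
components $\mathbf{v}$ and $\mathbf{w}$, gives us the two equations

$$\mathbf{v}[\mathbf{F}(\mathbf{x}; \mathbf{t})] = \mathbf{x} \quad \text{and} \quad \mathbf{w}[\mathbf{F}(\mathbf{x}; \mathbf{t})] = \mathbf{t}.$$

But now, every point $(\mathbf{x}; \mathbf{t})$ in Y can be written uniquely in the form $(\mathbf{x}; \mathbf{t}) = \mathbf{F}(\mathbf{x}'; \mathbf{t}')$
for some $(\mathbf{x}'; \mathbf{t}')$ in X, because $\mathbf{F}$ is one-to-one on X and the inverse image $\mathbf{F}^{-1}(Y)$
contains X. Furthermore, by the manner in which $\mathbf{F}$ was defined, when we write
$(\mathbf{x}; \mathbf{t}) = \mathbf{F}(\mathbf{x}'; \mathbf{t}')$, we must have $\mathbf{t}' = \mathbf{t}$. Therefore,

$$\mathbf{v}(\mathbf{x}; \mathbf{t}) = \mathbf{v}[\mathbf{F}(\mathbf{x}'; \mathbf{t})] = \mathbf{x}' \quad \text{and} \quad \mathbf{w}(\mathbf{x}; \mathbf{t}) = \mathbf{w}[\mathbf{F}(\mathbf{x}'; \mathbf{t})] = \mathbf{t}.$$

Hence the function $\mathbf{G}$ can be described as follows: Given a point $(\mathbf{x}; \mathbf{t})$ in Y, we
have $\mathbf{G}(\mathbf{x}; \mathbf{t}) = (\mathbf{x}'; \mathbf{t})$, where $\mathbf{x}'$ is that point in $\mathbf{R}^n$ such that $(\mathbf{x}; \mathbf{t}) = \mathbf{F}(\mathbf{x}'; \mathbf{t})$.
This statement implies that

$$\mathbf{F}[\mathbf{v}(\mathbf{x}; \mathbf{t}); \mathbf{t}] = (\mathbf{x}; \mathbf{t}) \quad \text{for every } (\mathbf{x}; \mathbf{t}) \text{ in } Y.$$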

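Now we are ready to define the set $T_0$ and the function $\mathbf{g}$ in the theorem. Let

$$T_0 = \{\mathbf{t} : \mathbf{t} \in \mathbf{R}^k,\ (\mathbf{0}; \mathbf{t}) \in Y\},$$

and for each $\mathbf{t}$ in $T_0$ define $\mathbf{g}(\mathbf{t}) = \mathbf{v}(\mathbf{0}; \mathbf{t})$. The set $T_0$ is open in $\mathbf{R}^k$. Moreover,
$\mathbf{g} \in C'$ on $T_0$ because $\mathbf{G} \in C'$ on $Y$ and the components of $\mathbf{g}$ are taken from the
components of $\mathbf{G}$. Also,

$$\mathbf{g}(\mathbf{t}_0) = \mathbf{v}(\mathbf{0}; \mathbf{t}_0) = \mathbf{x}_0$$

because $(\mathbf{0}; \mathbf{t}_0) = \mathbf{F}(\mathbf{x}_0; \mathbf{t}_0)$. Finally, the equation $\mathbf{F}[\mathbf{v}(\mathbf{x}; \mathbf{t}); \mathbf{t}] = (\mathbf{x}; \mathbf{t})$, which
holds for every $(\mathbf{x}; \mathbf{t})$ in $Y$, yields (by considering the components in $\mathbf{R}^n$) the
equation $\mathbf{f}[\mathbf{v}(\mathbf{x}; \mathbf{t}); \mathbf{t}] = \mathbf{x}$. Taking $\mathbf{x} = \mathbf{0}$, we see that for every $\mathbf{t}$ in $T_0$, we have
$\mathbf{f}[\mathbf{g}(\mathbf{t}); \mathbf{t}] = \mathbf{0}$, and this completes the proof of statements (a), (b), and (c). It
remains to prove that there is only one such function $\mathbf{g}$. But this follows at once
from the one-to-one character of $\mathbf{F}$. If there were another function, say $\mathbf{h}$, which
satisfied (c), then we would have $\mathbf{f}[\mathbf{g}(\mathbf{t}); \mathbf{t}] = \mathbf{f}[\mathbf{h}(\mathbf{t}); \mathbf{t}] = \mathbf{0}$, hence
$\mathbf{F}[\mathbf{g}(\mathbf{t}); \mathbf{t}] = \mathbf{F}[\mathbf{h}(\mathbf{t}); \mathbf{t}] = (\mathbf{0}; \mathbf{t})$, and this would imply
$(\mathbf{g}(\mathbf{t}); \mathbf{t}) = (\mathbf{h}(\mathbf{t}); \mathbf{t})$, or $\mathbf{g}(\mathbf{t}) = \mathbf{h}(\mathbf{t})$ for every $\mathbf{t}$ in $T_0$.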
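
The conclusion of the theorem can be illustrated numerically in the simplest case n = k = 1. The sketch below is not part of the text: it assumes the sample equation f(x; t) = x^2 + t^2 - 1 near (x_0; t_0) = (1; 0), where D_1 f = 2x is nonzero, and approximates the implicit function g by Newton's method in x. The function names and tolerances are illustrative choices; in this example g(t) should agree with sqrt(1 - t^2).

```python
import math


def f(x, t):
    """Sample constraint f(x; t) = x^2 + t^2 - 1 (an assumed example, n = k = 1)."""
    return x * x + t * t - 1.0


def df_dx(x, t):
    """Partial derivative D_1 f(x; t) = 2x, which is nonzero near x0 = 1."""
    return 2.0 * x


def implicit_g(t, x_start=1.0, tol=1e-12, max_iter=50):
    """Approximate the solution x = g(t) of f(x; t) = 0 near x0 by Newton's method."""
    x = x_start
    for _ in range(max_iter):
        step = f(x, t) / df_dx(x, t)
        x -= step
        if abs(step) < tol:
            break
    return x


if __name__ == "__main__":
    for t in (0.0, 0.3, 0.6):
        # Compare the Newton approximation with the explicit branch sqrt(1 - t^2).
        print(t, implicit_g(t), math.sqrt(1.0 - t * t))
```

Newton's method is used here only as a convenient way to evaluate g; the theorem itself guarantees existence and uniqueness of g on some neighborhood T_0 without giving a formula.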

13.5 EXTREMA OF REAL-VALUED FUNCTIONS OF ONE VARIABLE 

In the remainder of this chapter we shall consider real-valued functions $f$ with a
view toward determining those points (if any) at which $f$ has a local extremum,
that is, either a local maximum or a local minimum.

We have already obtained one result in this connection for functions of one
variable (Theorem 5.9). In that theorem we found that a necessary condition for a
function $f$ to have a local extremum at an interior point $c$ of an interval is that
$f'(c) = 0$, provided that $f'(c)$ exists. This condition, however, is not sufficient, as
we can see by taking $f(x) = x^3$, $c = 0$. We now derive a sufficient condition.

Theorem 13.8. For some integer $n \ge 1$, let $f$ have a continuous $n$th derivative in the
open interval $(a, b)$. Suppose also that for some interior point $c$ in $(a, b)$ we have

$$f'(c) = f''(c) = \cdots = f^{(n-1)}(c) = 0, \quad \text{but } f^{(n)}(c) \neq 0.$$

Then for $n$ even, $f$ has a local minimum at $c$ if $f^{(n)}(c) > 0$, and a local maximum at
$c$ if $f^{(n)}(c) < 0$. If $n$ is odd, there is neither a local maximum nor a local minimum
at $c$.

Proof. Since $f^{(n)}(c) \neq 0$, there exists an interval $B(c)$ such that for every $x$ in $B(c)$,
the derivative $f^{(n)}(x)$ will have the same sign as $f^{(n)}(c)$. Now by Taylor's formula
(Theorem 5.19), for every $x$ in $B(c)$ we have

$$f(x) - f(c) = \frac{f^{(n)}(x_1)}{n!} (x - c)^n, \quad \text{where } x_1 \in B(c).$$

If $n$ is even, this equation implies $f(x) \ge f(c)$ when $f^{(n)}(c) > 0$, and $f(x) \le f(c)$
when $f^{(n)}(c) < 0$. If $n$ is odd and $f^{(n)}(c) > 0$, then $f(x) > f(c)$ when $x > c$, but
$f(x) < f(c)$ when $x < c$, and there can be no extremum at $c$. A similar statement
holds if $n$ is odd and $f^{(n)}(c) < 0$. This proves the theorem.
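
The decision rule in Theorem 13.8 can be written as a short procedure. The following is a minimal sketch, assuming the caller supplies the values f'(c), f''(c), ... in order; the function name and the return strings are illustrative and do not come from the text.

```python
def classify_by_derivatives(derivs):
    """Apply the rule of Theorem 13.8 to derivs = [f'(c), f''(c), f'''(c), ...].

    The first nonzero entry plays the role of f^(n)(c). If every supplied
    value is zero, the rule gives no information from the data provided.
    """
    for order, value in enumerate(derivs, start=1):
        if value != 0:
            if order % 2 == 1:
                # n odd: neither a local maximum nor a local minimum.
                return "neither"
            # n even: the sign of f^(n)(c) decides.
            return "local minimum" if value > 0 else "local maximum"
    return "undetermined"


if __name__ == "__main__":
    print(classify_by_derivatives([0, 0, 0, 24]))  # f(x) = x^4 at c = 0
    print(classify_by_derivatives([0, 0, 6]))      # f(x) = x^3 at c = 0
    print(classify_by_derivatives([0, -2]))        # f(x) = -x^2 at c = 0
```

For f(x) = x^3 at c = 0 the first nonzero derivative has odd order 3, which matches the remark before the theorem that f'(c) = 0 alone is not sufficient for an extremum.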

13.6 EXTREMA OF REAL-VALUED FUNCTIONS OF SEVERAL VARIABLES 

We turn now to functions of several variables. Exercise 12.1 gives a necessary
condition for a function to have a local maximum or a local minimum at an interior
point $\mathbf{a}$ of an open set. The condition is that each partial derivative $D_k f(\mathbf{a})$ must
be zero at that point. We can also state this in terms of directional derivatives by
saying that $f'(\mathbf{a}; \mathbf{u})$ must be zero for every direction $\mathbf{u}$.

The converse of this statement is not true, however. Consider the following
example of a function of two real variables:

$$f(x, y) = (y - x^2)(y - 2x^2).$$

Here we have $D_1 f(0, 0) = D_2 f(0, 0) = 0$. Now $f(0, 0) = 0$, but the function
assumes both positive and negative values in every neighborhood of $(0, 0)$, so
there is neither a local maximum nor a local minimum at $(0, 0)$. (See Fig. 13.3.)

This example illustrates another interesting phenomenon. If we take a fixed
straight line through the origin and restrict the point $(x, y)$ to move along this line
toward $(0, 0)$, then the point will finally enter the region above the parabola
$y = 2x^2$ (or below the parabola $y = x^2$) in which $f(x, y)$ becomes and stays
positive for every $(x, y) \neq (0, 0)$. Therefore, along every such line, $f$ has a minimum
at $(0, 0)$, but the origin is not a local minimum in any two-dimensional neighborhood
of $(0, 0)$.

Figure 13.3


Definition 13.9. If $f$ is differentiable at $\mathbf{a}$ and if $\nabla f(\mathbf{a}) = \mathbf{0}$, the point $\mathbf{a}$ is called a
stationary point of $f$. A stationary point is called a saddle point if every $n$-ball $B(\mathbf{a})$
contains points $\mathbf{x}$ such that $f(\mathbf{x}) > f(\mathbf{a})$ and other points such that $f(\mathbf{x}) < f(\mathbf{a})$.

In the foregoing example, the origin is a saddle point of the function. 
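
The behavior of the example f(x, y) = (y - x^2)(y - 2x^2) near the origin can be checked numerically. The sketch below is an illustration only: the sampled directions, step sizes, and the intermediate curve y = 1.5 x^2 (which lies between the two parabolas) are assumptions chosen for the demonstration.

```python
def f(x, y):
    """The example from the text: f(x, y) = (y - x^2)(y - 2x^2)."""
    return (y - x * x) * (y - 2.0 * x * x)


def values_on_line(dx, dy, steps=(1e-3, 1e-4, 1e-5)):
    """Values of f at the points s*(dx, dy) for small s > 0 on a line through the origin."""
    return [f(s * dx, s * dy) for s in steps]


def values_between_parabolas(xs=(1e-1, 1e-2, 1e-3)):
    """Values of f on the curve y = 1.5 x^2, which lies between y = x^2 and y = 2x^2."""
    return [f(x, 1.5 * x * x) for x in xs]


if __name__ == "__main__":
    for direction in [(1.0, 0.0), (0.0, 1.0), (1.0, 0.1), (1.0, -1.0)]:
        print(direction, values_on_line(*direction))  # positive close to the origin
    print(values_between_parabolas())                 # negative arbitrarily near the origin
```

On the curve y = 1.5 x^2 one finds f = -x^4/4, so negative values occur in every neighborhood of the origin even though f is eventually positive along each straight line.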

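To determine whether a function of $n$ variables has a local maximum, a local
minimum, or a saddle point at a stationary point $\mathbf{a}$, we must determine the algebraic
sign of $f(\mathbf{x}) - f(\mathbf{a})$ for all $\mathbf{x}$ in a neighborhood of $\mathbf{a}$. As in the one-dimensional
case, this is done with the help of Taylor's formula (Theorem 12.14). Take $m = 2$
and $\mathbf{y} = \mathbf{a} + \mathbf{t}$ in Theorem 12.14. If the partial derivatives of $f$ are differentiable
on an $n$-ball $B(\mathbf{a})$ then

$$f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{t} + \tfrac{1}{2} f''(\mathbf{z}; \mathbf{t}), \tag{3}$$

where $\mathbf{z}$ lies on the line segment joining $\mathbf{a}$ and $\mathbf{a} + \mathbf{t}$, and

$$f''(\mathbf{z}; \mathbf{t}) = \sum_{i=1}^{n} \sum_{j=1}^{n} D_{i,j} f(\mathbf{z})\, t_i t_j.$$

At a stationary point we have $\nabla f(\mathbf{a}) = \mathbf{0}$ so (3) becomes

$$f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a}) = \tfrac{1}{2} f''(\mathbf{z}; \mathbf{t}).$$

Therefore, as $\mathbf{a} + \mathbf{t}$ ranges over $B(\mathbf{a})$, the algebraic sign of $f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a})$ is
determined by that of $f''(\mathbf{z}; \mathbf{t})$. We can write (3) in the form

$$f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a}) = \tfrac{1}{2} f''(\mathbf{a}; \mathbf{t}) + \|\mathbf{t}\|^2 E(\mathbf{t}), \tag{4}$$

where

$$\|\mathbf{t}\|^2 E(\mathbf{t}) = \tfrac{1}{2} f''(\mathbf{z}; \mathbf{t}) - \tfrac{1}{2} f''(\mathbf{a}; \mathbf{t}).$$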

The inequality

$$\|\mathbf{t}\|^2 |E(\mathbf{t})| \le \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} |D_{i,j} f(\mathbf{z}) - D_{i,j} f(\mathbf{a})| \, \|\mathbf{t}\|^2$$

shows that $E(\mathbf{t}) \to 0$ as $\mathbf{t} \to \mathbf{0}$ if the second-order partial derivatives of $f$ are
continuous at $\mathbf{a}$. Since $\|\mathbf{t}\|^2 E(\mathbf{t})$ tends to zero faster than $\|\mathbf{t}\|^2$, it seems reasonable
to expect that the algebraic sign of $f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a})$ should be determined by that
of $f''(\mathbf{a}; \mathbf{t})$. This is what is proved in the next theorem.

Theorem 13.10 (Second-derivative test for extrema). Assume that the second-order
partial derivatives $D_{i,j} f$ exist in an $n$-ball $B(\mathbf{a})$ and are continuous at $\mathbf{a}$, where $\mathbf{a}$ is a
stationary point of $f$. Let

$$Q(\mathbf{t}) = \tfrac{1}{2} f''(\mathbf{a}; \mathbf{t}) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} D_{i,j} f(\mathbf{a})\, t_i t_j. \tag{5}$$

a) If $Q(\mathbf{t}) > 0$ for all $\mathbf{t} \neq \mathbf{0}$, $f$ has a relative minimum at $\mathbf{a}$.

b) If $Q(\mathbf{t}) < 0$ for all $\mathbf{t} \neq \mathbf{0}$, $f$ has a relative maximum at $\mathbf{a}$.

c) If $Q(\mathbf{t})$ takes both positive and negative values, then $f$ has a saddle point at $\mathbf{a}$.

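Proof. The function $Q$ is continuous at each point $\mathbf{t}$ in $\mathbf{R}^n$. Let $S = \{\mathbf{t} : \|\mathbf{t}\| = 1\}$
denote the boundary of the $n$-ball $B(\mathbf{0}; 1)$. If $Q(\mathbf{t}) > 0$ for all $\mathbf{t} \neq \mathbf{0}$, then $Q(\mathbf{t})$ is
positive on $S$. Since $S$ is compact, $Q$ has a minimum on $S$ (call it $m$), and $m > 0$.
Now $Q(c\mathbf{t}) = c^2 Q(\mathbf{t})$ for every real $c$. Taking $c = 1/\|\mathbf{t}\|$ where $\mathbf{t} \neq \mathbf{0}$ we see that
$c\mathbf{t} \in S$ and hence $c^2 Q(\mathbf{t}) \ge m$, so $Q(\mathbf{t}) \ge m\|\mathbf{t}\|^2$. Using this in (4) we find

$$f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a}) = Q(\mathbf{t}) + \|\mathbf{t}\|^2 E(\mathbf{t}) \ge m\|\mathbf{t}\|^2 + \|\mathbf{t}\|^2 E(\mathbf{t}).$$

Since $E(\mathbf{t}) \to 0$ as $\mathbf{t} \to \mathbf{0}$, there is a positive number $r$ such that $|E(\mathbf{t})| < \tfrac{1}{2} m$
whenever $0 < \|\mathbf{t}\| < r$. For such $\mathbf{t}$ we have $0 \le \|\mathbf{t}\|^2 |E(\mathbf{t})| < \tfrac{1}{2} m \|\mathbf{t}\|^2$, so

$$f(\mathbf{a} + \mathbf{t}) - f(\mathbf{a}) > m\|\mathbf{t}\|^2 - \tfrac{1}{2} m\|\mathbf{t}\|^2 = \tfrac{1}{2} m\|\mathbf{t}\|^2 > 0.$$

Therefore $f$ has a relative minimum at $\mathbf{a}$, which proves (a). To prove (b) we use a
similar argument, or simply apply part (a) to $-f$.

Finally, we prove (c). For each $\lambda > 0$ we have, from (4),

$$f(\mathbf{a} + \lambda\mathbf{t}) - f(\mathbf{a}) = Q(\lambda\mathbf{t}) + \lambda^2 \|\mathbf{t}\|^2 E(\lambda\mathbf{t}) = \lambda^2 \{Q(\mathbf{t}) + \|\mathbf{t}\|^2 E(\lambda\mathbf{t})\}.$$

Suppose $Q(\mathbf{t}) \neq 0$ for some $\mathbf{t}$. Since $E(\mathbf{y}) \to 0$ as $\mathbf{y} \to \mathbf{0}$, there is a positive $r$ such
that

$$\|\mathbf{t}\|^2 |E(\lambda\mathbf{t})| < \tfrac{1}{2} |Q(\mathbf{t})| \quad \text{if } 0 < \lambda < r.$$

Therefore, for each such $\lambda$ the quantity $\lambda^2 \{Q(\mathbf{t}) + \|\mathbf{t}\|^2 E(\lambda\mathbf{t})\}$ has the same sign as
$Q(\mathbf{t})$. Therefore, if $0 < \lambda < r$, the difference $f(\mathbf{a} + \lambda\mathbf{t}) - f(\mathbf{a})$ has the same sign
as $Q(\mathbf{t})$. Hence, if $Q(\mathbf{t})$ takes both positive and negative values, it follows that $f$
has a saddle point at $\mathbf{a}$.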

NOTE. A real-valued function $Q$ defined on $\mathbf{R}^n$ by an equation of the type

$$Q(\mathbf{x}) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j,$$

where $\mathbf{x} = (x_1, \ldots, x_n)$ and the $a_{ij}$ are real is called a quadratic form. The form is
called symmetric if $a_{ij} = a_{ji}$ for all $i$ and $j$, positive definite if $\mathbf{x} \neq \mathbf{0}$ implies
$Q(\mathbf{x}) > 0$, and negative definite if $\mathbf{x} \neq \mathbf{0}$ implies $Q(\mathbf{x}) < 0$.

In general, it is not easy to determine whether a quadratic form is positive or
negative definite. One criterion, involving eigenvalues, is described in Reference
13.1, Theorem 9.5. Another, involving determinants, can be described as follows.
Let $\Delta = \det [a_{ij}]$ and let $\Delta_k$ denote the determinant of the $k \times k$ matrix obtained
by deleting the last $(n - k)$ rows and columns of $[a_{ij}]$. Also, put $\Delta_0 = 1$. From
the theory of quadratic forms it is known that a necessary and sufficient condition
for a symmetric form to be positive definite is that the $n + 1$ numbers
$\Delta_0, \Delta_1, \ldots, \Delta_n$ be positive. The form is negative definite if, and only if, the same
$n + 1$ numbers are alternately positive and negative. (See Reference 13.2, pp.
304-308.) The quadratic form which appears in (5) is symmetric because the
mixed partials $D_{i,j} f(\mathbf{a})$ and $D_{j,i} f(\mathbf{a})$ are equal. Therefore, under the conditions of
Theorem 13.10, we see that $f$ has a local minimum at $\mathbf{a}$ if the $(n + 1)$ numbers
$\Delta_0, \Delta_1, \ldots, \Delta_n$ are all positive, and a local maximum if these numbers are
alternately positive and negative. The case $n = 2$ can be handled directly and gives
the following criterion.

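Theorem 13.11. Let $f$ be a real-valued function with continuous second-order partial
derivatives at a stationary point $\mathbf{a}$ in $\mathbf{R}^2$. Let

$$A = D_{1,1} f(\mathbf{a}), \quad B = D_{1,2} f(\mathbf{a}), \quad C = D_{2,2} f(\mathbf{a}),$$

and let

$$\Delta = \det \begin{bmatrix} A & B \\ B & C \end{bmatrix} = AC - B^2.$$

Then we have:

a) If $\Delta > 0$ and $A > 0$, $f$ has a relative minimum at $\mathbf{a}$.
b) If $\Delta > 0$ and $A < 0$, $f$ has a relative maximum at $\mathbf{a}$.
c) If $\Delta < 0$, $f$ has a saddle point at $\mathbf{a}$.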

Proof. In the two-dimensional case we can write the quadratic form in (5) as
follows:

$$Q(x, y) = \tfrac{1}{2} \{Ax^2 + 2Bxy + Cy^2\}.$$

If $A \neq 0$, this can also be written as

$$Q(x, y) = \frac{1}{2A} \{(Ax + By)^2 + \Delta y^2\}.$$

If $\Delta > 0$, the expression in brackets is the sum of two squares, so $Q(x, y)$ has the
same sign as $A$. Therefore, statements (a) and (b) follow at once from parts (a)
and (b) of Theorem 13.10.

If $\Delta < 0$, the quadratic form is the product of two linear factors. Therefore,
the set of points $(x, y)$ such that $Q(x, y) = 0$ consists of two lines in the $xy$-plane
intersecting at $(0, 0)$. These lines divide the plane into four regions; $Q(x, y)$ is
positive in two of these regions and negative in the other two. Therefore $f$ has a
saddle point at $\mathbf{a}$.
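
Both the determinant criterion described before Theorem 13.11 and the two-variable test of the theorem itself lend themselves to a short computation. The sketch below is illustrative only: the function names and return strings are assumptions, the matrix is taken to be the symmetric coefficient matrix of the form, and exact rational arithmetic is used so that sign tests are reliable.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap changes the sign of the determinant
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det


def leading_minors(matrix):
    """Return [Delta_0, Delta_1, ..., Delta_n], with Delta_0 = 1."""
    n = len(matrix)
    return [Fraction(1)] + [
        determinant([row[:k] for row in matrix[:k]]) for k in range(1, n + 1)
    ]


def definiteness(matrix):
    """Classify a symmetric form by the signs of its leading minors."""
    minors = leading_minors(matrix)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((d > 0) if k % 2 == 0 else (d < 0) for k, d in enumerate(minors)):
        return "negative definite"
    return "not definite"


def classify_stationary_point_2d(A, B, C):
    """Theorem 13.11 with A = D11 f(a), B = D12 f(a), C = D22 f(a)."""
    delta = A * C - B * B
    if delta > 0:
        return "relative minimum" if A > 0 else "relative maximum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"  # Delta = 0: the theorem gives no information


if __name__ == "__main__":
    print(leading_minors([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
    print(definiteness([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
    print(classify_stationary_point_2d(0, 1, 0))  # f(x, y) = xy at the origin
```

In the case Delta > 0 the product AC exceeds B^2 >= 0, so A cannot be zero and the two branches in `classify_stationary_point_2d` cover that case completely.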

NOTE. If $\Delta = 0$, there may be a local maximum, a local minimum, or a saddle
point at $\mathbf{a}$.

13.7 EXTREMUM PROBLEMS WITH SIDE CONDITIONS

Consider the following type of extremum problem. Suppose that $f(x, y, z)$
represents the temperature at the point $(x, y, z)$ in space and we ask for the maximum
or minimum value of the temperature on a certain surface. If the equation of
the surface is given explicitly in the form $z = h(x, y)$, then in the expression
$f(x, y, z)$ we can replace $z$ by $h(x, y)$ to obtain the temperature on the surface as a
function of $x$ and $y$ alone, say $F(x, y) = f[x, y, h(x, y)]$. The problem is then
reduced to finding the extreme values of $F$. However, in practice, certain difficulties
arise. The equation of the surface might be given in an implicit form, say
$g(x, y, z) = 0$, and it may be impossible, in practice, to solve this equation
explicitly for $z$ in terms of $x$ and $y$, or even for $x$ or $y$ in terms of the remaining
variables. The problem might be further complicated by asking for the extreme
values of the temperature at those points which lie on a given curve in space. Such
a curve is the intersection of two surfaces, say $g_1(x, y, z) = 0$ and $g_2(x, y, z) = 0$.
If we could solve these two equations simultaneously, say for $x$ and $y$ in terms of $z$,
then we could introduce these expressions into $f$ and obtain a new function of
$z$ alone, whose extrema we would then seek. In general, however, this procedure
cannot be carried out and a more practicable method must be sought. A very
elegant and useful method for attacking such problems was developed by Lagrange.

Lagrange's method provides a necessary condition for an extremum and can be
described as follows. Let $f(x_1, \ldots, x_n)$ be an expression whose extreme values are
sought when the variables are restricted by a certain number of side conditions,
say $g_1(x_1, \ldots, x_n) = 0, \ldots, g_m(x_1, \ldots, x_n) = 0$. We then form the linear
combination

$$\phi(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) + \lambda_1 g_1(x_1, \ldots, x_n) + \cdots + \lambda_m g_m(x_1, \ldots, x_n),$$

where $\lambda_1, \ldots, \lambda_m$ are $m$ constants. We then differentiate $\phi$ with respect to each
coordinate and consider the following system of $n + m$ equations:

$$D_r \phi(x_1, \ldots, x_n) = 0, \quad r = 1, 2, \ldots, n,$$

$$g_k(x_1, \ldots, x_n) = 0, \quad k = 1, 2, \ldots, m.$$

Lagrange discovered that if the point $(x_1, \ldots, x_n)$ is a solution of the extremum
problem, then it will also satisfy this system of $n + m$ equations. In practice, one
attempts to solve this system for the $n + m$ "unknowns," $\lambda_1, \ldots, \lambda_m$, and
$x_1, \ldots, x_n$. The points $(x_1, \ldots, x_n)$ so obtained must then be tested to determine
whether they yield a maximum, a minimum, or neither. The numbers $\lambda_1, \ldots, \lambda_m$,
which are introduced only to help solve the system for $x_1, \ldots, x_n$, are known as
Lagrange's multipliers. One multiplier is introduced for each side condition.
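
The system of n + m equations can be seen in action on a small problem. The following sketch is an assumed example, not one from the text: it seeks the extreme value of f(x, y) = x^2 + y^2 subject to the single side condition g(x, y) = x + y - 1 = 0. Here the Lagrange system 2x + lambda = 0, 2y + lambda = 0, x + y = 1 happens to be linear, so it can be solved exactly by elimination; the helper names are hypothetical.

```python
from fractions import Fraction


def solve_linear_system(coefficients, rhs):
    """Solve a square linear system by Gauss-Jordan elimination (assumed nonsingular)."""
    n = len(rhs)
    aug = [[Fraction(c) for c in row] + [Fraction(b)]
           for row, b in zip(coefficients, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot; raises StopIteration if the matrix is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [entry / pivot_value for entry in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def lagrange_point():
    """Solve D_1 phi = 0, D_2 phi = 0, g = 0 for the unknowns (x, y, lambda)."""
    coefficients = [
        [2, 0, 1],  # D_1 phi = 2x + lambda = 0
        [0, 2, 1],  # D_2 phi = 2y + lambda = 0
        [1, 1, 0],  # side condition x + y = 1
    ]
    rhs = [0, 0, 1]
    x, y, lam = solve_linear_system(coefficients, rhs)
    return x, y, lam


if __name__ == "__main__":
    print(lagrange_point())
```

The solution x = y = 1/2 with multiplier -1 is only a candidate produced by the necessary condition; that it gives a minimum of the squared distance from the origin to the line follows here from geometric considerations.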

A complicated analytic criterion exists for distinguishing between maxima and
minima in such problems. (See, for example, Reference 13.3.) However, this
criterion is not very useful in practice and in any particular problem it is usually
easier to rely on some other means (for example, physical or geometrical considerations)
to make this distinction.

The following theorem establishes the validity of Lagrange's method:

Theorem 13.12. Let $f$ be a real-valued function such that $f \in C'$ on an open set $S$
in $\mathbf{R}^n$. Let $g_1, \ldots, g_m$ be $m$ real-valued functions such that $\mathbf{g} = (g_1, \ldots, g_m) \in C'$
on $S$, and assume that $m < n$. Let $X_0$ be that subset of $S$ on which $\mathbf{g}$ vanishes, that is,

$$X_0 = \{\mathbf{x} : \mathbf{x} \in S,\ \mathbf{g}(\mathbf{x}) = \mathbf{0}\}.$$

Assume that $\mathbf{x}_0 \in X_0$ and assume that there exists an $n$-ball $B(\mathbf{x}_0)$ such that
$f(\mathbf{x}) \le f(\mathbf{x}_0)$ for all $\mathbf{x}$ in $X_0 \cap B(\mathbf{x}_0)$ or such that $f(\mathbf{x}) \ge f(\mathbf{x}_0)$ for all $\mathbf{x}$ in
$X_0 \cap B(\mathbf{x}_0)$. Assume also that the $m$-rowed determinant $\det [D_j g_i(\mathbf{x}_0)] \neq 0$. Then
there exist $m$ real numbers $\lambda_1, \ldots, \lambda_m$ such that the following $n$ equations are satisfied:

$$D_r f(\mathbf{x}_0) + \sum_{k=1}^{m} \lambda_k D_r g_k(\mathbf{x}_0) = 0 \quad (r = 1, 2, \ldots, n). \tag{6}$$

NOTE. The $n$ equations in (6) are equivalent to the following vector equation:

$$\nabla f(\mathbf{x}_0) + \lambda_1 \nabla g_1(\mathbf{x}_0) + \cdots + \lambda_m \nabla g_m(\mathbf{x}_0) = \mathbf{0}.$$

Proof. Consider the following system of $m$ linear equations in the $m$ unknowns
$\lambda_1, \ldots, \lambda_m$:

$$\sum_{k=1}^{m} \lambda_k D_r g_k(\mathbf{x}_0) = -D_r f(\mathbf{x}_0) \quad (r = 1, 2, \ldots, m).$$

This system has a unique solution since, by hypothesis, the determinant of the
system is not zero. Therefore, the first $m$ equations in (6) are satisfied. We must
now verify that for this choice of $\lambda_1, \ldots, \lambda_m$, the remaining $n - m$ equations in
(6) are also satisfied.

To do this, we apply the implicit function theorem. Since $m < n$, every point
$\mathbf{x}$ in $S$ can be written in the form $\mathbf{x} = (\mathbf{x}'; \mathbf{t})$, say, where $\mathbf{x}' \in \mathbf{R}^m$ and $\mathbf{t} \in \mathbf{R}^{n-m}$.
In the remainder of this proof we will write $\mathbf{x}'$ for $(x_1, \ldots, x_m)$ and $\mathbf{t}$ for
$(x_{m+1}, \ldots, x_n)$, so that $t_k = x_{m+k}$. In terms of the vector-valued function
$\mathbf{g} = (g_1, \ldots, g_m)$, we can now write

$$\mathbf{g}(\mathbf{x}_0'; \mathbf{t}_0) = \mathbf{0} \quad \text{if } \mathbf{x}_0 = (\mathbf{x}_0'; \mathbf{t}_0).$$

Since $\mathbf{g} \in C'$ on $S$, and since the determinant $\det [D_j g_i(\mathbf{x}_0'; \mathbf{t}_0)] \neq 0$, all the
conditions of the implicit function theorem are satisfied. Therefore, there exists
an $(n - m)$-dimensional neighborhood $T_0$ of $\mathbf{t}_0$ and a unique vector-valued
function $\mathbf{h} = (h_1, \ldots, h_m)$, defined on $T_0$ and having values in $\mathbf{R}^m$ such that
$\mathbf{h} \in C'$ on $T_0$, $\mathbf{h}(\mathbf{t}_0) = \mathbf{x}_0'$, and for every $\mathbf{t}$ in $T_0$, we have $\mathbf{g}[\mathbf{h}(\mathbf{t}); \mathbf{t}] = \mathbf{0}$. This
amounts to saying that the system of $m$ equations

$$g_1(x_1, \ldots, x_n) = 0, \ \ldots, \ g_m(x_1, \ldots, x_n) = 0$$

can be solved for $x_1, \ldots, x_m$ in terms of $x_{m+1}, \ldots, x_n$, giving the solutions in the
form $x_r = h_r(x_{m+1}, \ldots, x_n)$, $r = 1, 2, \ldots, m$. We shall now substitute these
expressions for $x_1, \ldots, x_m$ into the expression $f(x_1, \ldots, x_n)$ and also into each
expression $g_p(x_1, \ldots, x_n)$. That is to say, we define a new function $F$ as follows:

$$F(x_{m+1}, \ldots, x_n) = f[h_1(x_{m+1}, \ldots, x_n), \ldots, h_m(x_{m+1}, \ldots, x_n); x_{m+1}, \ldots, x_n],$$

and we define $m$ new functions $G_1, \ldots, G_m$ as follows:

$$G_p(x_{m+1}, \ldots, x_n) = g_p[h_1(x_{m+1}, \ldots, x_n), \ldots, h_m(x_{m+1}, \ldots, x_n); x_{m+1}, \ldots, x_n].$$

More briefly, we can write $F(\mathbf{t}) = f[\mathbf{H}(\mathbf{t})]$ and $G_p(\mathbf{t}) = g_p[\mathbf{H}(\mathbf{t})]$, where
$\mathbf{H}(\mathbf{t}) = (\mathbf{h}(\mathbf{t}); \mathbf{t})$. Here $\mathbf{t}$ is restricted to lie in the set $T_0$.

Each function $G_p$ so defined is identically zero on the set $T_0$ by the implicit
function theorem. Therefore, each derivative $D_r G_p$ is also identically zero on $T_0$
and, in particular, $D_r G_p(\mathbf{t}_0) = 0$. But by the chain rule (Eq. 12.20), we can compute
these derivatives as follows:

$$D_r G_p(\mathbf{t}_0) = \sum_{k=1}^{n} D_k g_p(\mathbf{x}_0) D_r H_k(\mathbf{t}_0) \quad (r = 1, 2, \ldots, n - m).$$

But $H_k(\mathbf{t}) = h_k(\mathbf{t})$ if $1 \le k \le m$, and $H_k(\mathbf{t}) = x_k$ if $m + 1 \le k \le n$. Therefore,
when $m + 1 \le k \le n$, we have $D_r H_k(\mathbf{t}) = 0$ if $m + r \neq k$ and $D_r H_{m+r}(\mathbf{t}) = 1$
for every $\mathbf{t}$. Hence the above set of equations becomes

$$\sum_{k=1}^{m} D_k g_p(\mathbf{x}_0) D_r h_k(\mathbf{t}_0) + D_{m+r} g_p(\mathbf{x}_0) = 0 \quad (p = 1, 2, \ldots, m;\ r = 1, 2, \ldots, n - m). \tag{7}$$


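By continuity of $\mathbf{h}$, there is an $(n - m)$-ball $B(\mathbf{t}_0) \subseteq T_0$ such that $\mathbf{t} \in B(\mathbf{t}_0)$
implies $(\mathbf{h}(\mathbf{t}); \mathbf{t}) \in B(\mathbf{x}_0)$, where $B(\mathbf{x}_0)$ is the $n$-ball in the statement of the theorem.
Hence, $\mathbf{t} \in B(\mathbf{t}_0)$ implies $(\mathbf{h}(\mathbf{t}); \mathbf{t}) \in X_0 \cap B(\mathbf{x}_0)$ and therefore, by hypothesis, we
have either $F(\mathbf{t}) \le F(\mathbf{t}_0)$ for all $\mathbf{t}$ in $B(\mathbf{t}_0)$ or else we have $F(\mathbf{t}) \ge F(\mathbf{t}_0)$ for all $\mathbf{t}$ in
$B(\mathbf{t}_0)$. That is, $F$ has a local maximum or a local minimum at the interior point $\mathbf{t}_0$.
Each partial derivative $D_r F(\mathbf{t}_0)$ must therefore be zero. If we use the chain rule to
compute these derivatives, we find

$$D_r F(\mathbf{t}_0) = \sum_{k=1}^{n} D_k f(\mathbf{x}_0) D_r H_k(\mathbf{t}_0) \quad (r = 1, \ldots, n - m),$$

and hence we can write

$$\sum_{k=1}^{m} D_k f(\mathbf{x}_0) D_r h_k(\mathbf{t}_0) + D_{m+r} f(\mathbf{x}_0) = 0 \quad (r = 1, \ldots, n - m). \tag{8}$$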


If we now multiply (7) by $\lambda_p$, sum on $p$, and add the result to (8), we find

$$\sum_{k=1}^{m} \left[ D_k f(\mathbf{x}_0) + \sum_{p=1}^{m} \lambda_p D_k g_p(\mathbf{x}_0) \right] D_r h_k(\mathbf{t}_0) + D_{m+r} f(\mathbf{x}_0) + \sum_{p=1}^{m} \lambda_p D_{m+r} g_p(\mathbf{x}_0) = 0,$$

for $r = 1, \ldots, n - m$. In the sum over $k$, the expression in square brackets
vanishes because of the way $\lambda_1, \ldots, \lambda_m$ were defined. Thus we are left with

$$D_{m+r} f(\mathbf{x}_0) + \sum_{p=1}^{m} \lambda_p D_{m+r} g_p(\mathbf{x}_0) = 0 \quad (r = 1, 2, \ldots, n - m),$$

and these are exactly the equations needed to complete the proof.

NOTE. In attempting the solution of a particular extremum problem by Lagrange's
method, it is usually very easy to determine the system of equations (6) but, in
general, it is not a simple matter to actually solve the system. Special devices can
often be employed to obtain the extreme values of $f$ directly from (6) without first
finding the particular points where these extremes are taken on. The following
example illustrates some of these devices:

Example. A quadric surface with center at the origin has the equation

$$Ax^2 + By^2 + Cz^2 + 2Dyz + 2Ezx + 2Fxy = 1.$$

Find the lengths of its semi-axes.

Solution. Let us write $(x_1, x_2, x_3)$ instead of $(x, y, z)$, and introduce the quadratic form

$$q(\mathbf{x}) = \sum_{j=1}^{3} \sum_{i=1}^{3} a_{ij} x_i x_j, \tag{9}$$

where $\mathbf{x} = (x_1, x_2, x_3)$ and the $a_{ij} = a_{ji}$ are chosen so that the equation of the surface
becomes $q(\mathbf{x}) = 1$. (Hence the quadratic form is symmetric and positive definite.) The
problem is equivalent to finding the extreme values of $f(\mathbf{x}) = \|\mathbf{x}\|^2 = x_1^2 + x_2^2 + x_3^2$
subject to the side condition $g(\mathbf{x}) = 0$, where $g(\mathbf{x}) = q(\mathbf{x}) - 1$. Using Lagrange's method,
we introduce one multiplier and consider the vector equation

$$\nabla f(\mathbf{x}) + \lambda \nabla q(\mathbf{x}) = \mathbf{0} \tag{10}$$

(since $\nabla g = \nabla q$). In this particular case, both $f$ and $q$ are homogeneous functions of
degree 2 and we can apply Euler's theorem (see Exercise 12.18) in (10) to obtain

$$\mathbf{x} \cdot \nabla f(\mathbf{x}) + \lambda \mathbf{x} \cdot \nabla q(\mathbf{x}) = 2f(\mathbf{x}) + 2\lambda q(\mathbf{x}) = 0.$$

Since $q(\mathbf{x}) = 1$ on the surface we find $\lambda = -f(\mathbf{x})$, and (10) becomes

$$t \nabla f(\mathbf{x}) - \nabla q(\mathbf{x}) = \mathbf{0}, \tag{11}$$

where $t = 1/f(\mathbf{x})$. (We cannot have $f(\mathbf{x}) = 0$ in this problem.) The vector equation (11)
then leads to the following three equations for $x_1, x_2, x_3$:

$$(a_{11} - t)x_1 + a_{12}x_2 + a_{13}x_3 = 0,$$
$$a_{21}x_1 + (a_{22} - t)x_2 + a_{23}x_3 = 0,$$
$$a_{31}x_1 + a_{32}x_2 + (a_{33} - t)x_3 = 0.$$

Since $\mathbf{x} = \mathbf{0}$ cannot yield a solution to our problem, the determinant of this system must
vanish. That is, we must have

$$\begin{vmatrix} a_{11} - t & a_{12} & a_{13} \\ a_{21} & a_{22} - t & a_{23} \\ a_{31} & a_{32} & a_{33} - t \end{vmatrix} = 0. \tag{12}$$

Equation (12) is called the characteristic equation of the quadratic form in (9). In this case,
the geometrical nature of the problem assures us that the three roots $t_1, t_2, t_3$ of this cubic
must be real and positive. [Since $q(\mathbf{x})$ is symmetric and positive definite, the general
theory of quadratic forms also guarantees that the roots of (12) are all real and positive.
(See Reference 13.1, Theorem 9.5.)] The semi-axes of the quadric surface are
$t_1^{-1/2}$, $t_2^{-1/2}$, $t_3^{-1/2}$.
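
The recipe in this example (form the symmetric matrix of the quadric, find the roots of the characteristic equation (12), and take reciprocal square roots) can be sketched in code. The sketch below is an illustration under stated assumptions: the roots are computed with the standard trigonometric closed form for the eigenvalues of a real symmetric 3 x 3 matrix, which is a technique swapped in for convenience and not the method of the text, and the function names are hypothetical. The surface is assumed to be an ellipsoid, so that all three roots are positive.

```python
import math


def symmetric_eigenvalues_3x3(a):
    """Roots t1 >= t2 >= t3 of det(a - t I) = 0 for a real symmetric 3x3 matrix."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0:
        # Diagonal matrix: the roots are the diagonal entries.
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    t1 = q + 2.0 * p * math.cos(phi)
    t3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    t2 = 3.0 * q - t1 - t3  # the three roots sum to the trace
    return [t1, t2, t3]


def semi_axes(A, B, C, D, E, F):
    """Semi-axes of Ax^2 + By^2 + Cz^2 + 2Dyz + 2Ezx + 2Fxy = 1, in increasing order."""
    a = [[A, F, E], [F, B, D], [E, D, C]]
    roots = symmetric_eigenvalues_3x3(a)
    if min(roots) <= 0:
        raise ValueError("the form is not positive definite; not an ellipsoid")
    return sorted(1.0 / math.sqrt(t) for t in roots)


if __name__ == "__main__":
    # x^2/4 + y^2/9 + z^2 = 1 should have semi-axes close to 1, 2, 3.
    print(semi_axes(0.25, 1.0 / 9.0, 1.0, 0.0, 0.0, 0.0))
```

Because the closed form passes through an inverse cosine, results near repeated roots are accurate only to roughly single precision, so comparisons should use a tolerance.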


EXERCISES 


Jacobians 

13.1 Let $f$ be the complex-valued function defined for each complex $z \neq 0$ by the
equation $f(z) = 1/\bar{z}$. Show that $J_f(z) = -|z|^{-4}$. Show that $f$ is one-to-one and compute
$f^{-1}$ explicitly.

13.2 Let $\mathbf{f} = (f_1, f_2, f_3)$ be the vector-valued function defined (for every point $(x_1, x_2, x_3)$
in $\mathbf{R}^3$ for which $x_1 + x_2 + x_3 \neq -1$) as follows:

$$f_k(x_1, x_2, x_3) = \frac{x_k}{1 + x_1 + x_2 + x_3} \quad (k = 1, 2, 3).$$

Show that $J_{\mathbf{f}}(x_1, x_2, x_3) = (1 + x_1 + x_2 + x_3)^{-4}$. Show that $\mathbf{f}$ is one-to-one and
compute $\mathbf{f}^{-1}$ explicitly.

13.3 Let $\mathbf{f} = (f_1, \ldots, f_n)$ be a vector-valued function defined in $\mathbf{R}^n$, suppose $\mathbf{f} \in C'$
on $\mathbf{R}^n$, and let $J_{\mathbf{f}}(\mathbf{x})$ denote the Jacobian determinant. Let $g_1, \ldots, g_n$ be $n$ real-valued
functions defined on $\mathbf{R}^1$ and having continuous derivatives $g_1', \ldots, g_n'$. Let
$h_k(\mathbf{x}) = f_k[g_1(x_1), \ldots, g_n(x_n)]$, $k = 1, 2, \ldots, n$, and put $\mathbf{h} = (h_1, \ldots, h_n)$. Show that

$$J_{\mathbf{h}}(\mathbf{x}) = J_{\mathbf{f}}[g_1(x_1), \ldots, g_n(x_n)]\, g_1'(x_1) \cdots g_n'(x_n).$$

13.4 a) If $x(r, \theta) = r \cos \theta$, $y(r, \theta) = r \sin \theta$, show that

$$\frac{\partial(x, y)}{\partial(r, \theta)} = r.$$

b) If $x(r, \theta, \phi) = r \cos \theta \sin \phi$, $y(r, \theta, \phi) = r \sin \theta \sin \phi$, $z = r \cos \phi$, show that

$$\frac{\partial(x, y, z)}{\partial(r, \theta, \phi)} = -r^2 \sin \phi.$$


13.5 a) State conditions on $f$ and $g$ which will ensure that the equations $x = f(u, v)$,
$y = g(u, v)$ can be solved for $u$ and $v$ in a neighborhood of $(x_0, y_0)$. If the solutions
are $u = F(x, y)$, $v = G(x, y)$, and if $J = \partial(f, g)/\partial(u, v)$, show that

$$\frac{\partial F}{\partial x} = \frac{1}{J} \frac{\partial g}{\partial v}, \quad \frac{\partial F}{\partial y} = -\frac{1}{J} \frac{\partial f}{\partial v}, \quad \frac{\partial G}{\partial x} = -\frac{1}{J} \frac{\partial g}{\partial u}, \quad \frac{\partial G}{\partial y} = \frac{1}{J} \frac{\partial f}{\partial u}.$$

b) Compute $J$ and the partial derivatives of $F$ and $G$ at $(x_0, y_0) = (1, 1)$ when
$f(u, v) = u^2 - v^2$, $g(u, v) = 2uv$.


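13.6 Let $\mathbf{f}$ and $\mathbf{g}$ be related as in Theorem 13.6. Consider the case $n = 3$ and show that
we have

$$J_{\mathbf{f}}(\mathbf{x}) D_1 g_i(\mathbf{y}) = \begin{vmatrix} \delta_{i,1} & D_1 f_2(\mathbf{x}) & D_1 f_3(\mathbf{x}) \\ \delta_{i,2} & D_2 f_2(\mathbf{x}) & D_2 f_3(\mathbf{x}) \\ \delta_{i,3} & D_3 f_2(\mathbf{x}) & D_3 f_3(\mathbf{x}) \end{vmatrix} \quad (i = 1, 2, 3),$$

where $\mathbf{y} = \mathbf{f}(\mathbf{x})$ and $\delta_{i,j} = 0$ or $1$ according as $i \neq j$ or $i = j$. Use this to deduce the
formula

$$D_1 g_1 = \frac{\partial(f_2, f_3)}{\partial(x_2, x_3)} \bigg/ \frac{\partial(f_1, f_2, f_3)}{\partial(x_1, x_2, x_3)}.$$

There are similar expressions for the other eight derivatives $D_k g_i$.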

13.7 Let $f = u + iv$ be a complex-valued function satisfying the following conditions:
$u \in C'$ and $v \in C'$ on the open disk $A = \{z : |z| < 1\}$; $f$ is continuous on the closed disk
$\bar{A} = \{z : |z| \le 1\}$; $u(x, y) = x$ and $v(x, y) = y$ whenever $x^2 + y^2 = 1$; the Jacobian
$J_f(z) > 0$ if $z \in A$. Let $B = f(A)$ denote the image of $A$ under $f$ and prove that:

a) If $X$ is an open subset of $A$, then $f(X)$ is an open subset of $B$.

b) $B$ is an open disk of radius 1.

c) For each point $u_0 + iv_0$ in $B$, there is only a finite number of points $z$ in $A$ such
that $f(z) = u_0 + iv_0$.


Extremum problems 

13.8 Find and classify the extreme values (if any) of the functions defined by the following
equations:

a) $f(x, y) = y^2 + x^2 y + x^4$,

b) $f(x, y) = x^2 + y^2 + x + y + xy$,

c) $f(x, y) = (x - 1)^4 + (x - y)^4$,

d) $f(x, y) = y^2 - x^3$.

13.9 Find the shortest distance from the point $(0, b)$ on the $y$-axis to the parabola
$x^2 - 4y = 0$. Solve this problem using Lagrange's method and also without using
Lagrange's method.

13.10 Solve the following geometric problems by Lagrange's method:

a) Find the shortest distance from the point $(a_1, a_2, a_3)$ in $\mathbf{R}^3$ to the plane whose
equation is $b_1 x_1 + b_2 x_2 + b_3 x_3 + b_0 = 0$.

b) Find the point on the line of intersection of the two planes

$$a_1 x_1 + a_2 x_2 + a_3 x_3 + a_0 = 0$$

and

$$b_1 x_1 + b_2 x_2 + b_3 x_3 + b_0 = 0$$

which is nearest the origin.

13.11 Find the maximum value of $\left| \sum_{k=1}^{n} a_k x_k \right|$, if $\sum_{k=1}^{n} x_k^2 = 1$, by using

a) the Cauchy-Schwarz inequality.

b) Lagrange's method.

13.12 Find the maximum of $(x_1 x_2 \cdots x_n)^2$ under the restriction

$$x_1^2 + \cdots + x_n^2 = 1.$$

Use the result to derive the following inequality, valid for positive real numbers $a_1, \ldots, a_n$:

$$(a_1 \cdots a_n)^{1/n} \le \frac{a_1 + \cdots + a_n}{n}.$$

13.13 If $f(\mathbf{x}) = x_1^k + \cdots + x_n^k$, $\mathbf{x} = (x_1, \ldots, x_n)$, show that a local extreme of $f$, subject
to the condition $x_1 + \cdots + x_n = a$, is $a^k n^{1-k}$.

13.14 Show that all points $(x_1, x_2, x_3, x_4)$ where $x_1^2 + x_2^2$ has a local extremum subject
to the two side conditions $x_1^2 + x_3^2 + x_4^2 = 4$, $x_2^2 + 2x_3^2 + 3x_4^2 = 9$, are found among

$$(0, 0, \pm\sqrt{3}, \pm 1), \quad (0, \pm 1, \pm 2, 0), \quad (\pm 1, 0, 0, \pm\sqrt{3}), \quad (\pm 2, \pm 3, 0, 0).$$

Which of these yield a local maximum and which yield a local minimum? Give reasons
for your conclusions.

13.15 Show that the extreme values of $f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$, subject to the two
side conditions

$$\sum_{j=1}^{3} \sum_{i=1}^{3} a_{ij} x_i x_j = 1 \quad (a_{ij} = a_{ji})$$

and

$$b_1 x_1 + b_2 x_2 + b_3 x_3 = 0, \quad (b_1, b_2, b_3) \neq (0, 0, 0),$$

are $t_1^{-1}$, $t_2^{-1}$, where $t_1$ and $t_2$ are the roots of the equation

$$\begin{vmatrix} b_1 & b_2 & b_3 & 0 \\ a_{11} - t & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} - t & a_{23} & b_2 \\ a_{31} & a_{32} & a_{33} - t & b_3 \end{vmatrix} = 0.$$

Show that this is a quadratic equation in $t$ and give a geometric argument to explain why
the roots $t_1$, $t_2$ are real and positive.

13.16 Let $\Delta = \det [x_{ij}]$ and let $X_i = (x_{i1}, \ldots, x_{in})$. A famous theorem of Hadamard
states that $|\Delta| \le d_1 \cdots d_n$, if $d_1, \ldots, d_n$ are $n$ positive constants such that $\|X_i\|^2 = d_i^2$
$(i = 1, 2, \ldots, n)$. Prove this by treating $\Delta$ as a function of $n^2$ variables subject to $n$
constraints, using Lagrange's method to show that, when $\Delta$ has an extreme under these
conditions, we must have

$$\Delta^2 = \begin{vmatrix} d_1^2 & 0 & 0 & \cdots & 0 \\ 0 & d_2^2 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & d_n^2 \end{vmatrix}.$$


CHAPTER 14


MULTIPLE RIEMANN INTEGRALS 


14.1 INTRODUCTION 

The Riemann integral $\int_a^b f(x)\,dx$ can be generalized by replacing the interval $[a, b]$
by an $n$-dimensional region in which $f$ is defined and bounded. The simplest
regions in $\mathbf{R}^n$ suitable for this purpose are $n$-dimensional intervals. For example,
in $\mathbf{R}^2$ we take a rectangle $I$ partitioned into subrectangles $I_k$ and consider Riemann
sums of the form $\sum f(x_k, y_k) A(I_k)$, where $(x_k, y_k) \in I_k$ and $A(I_k)$ denotes the area of
$I_k$. This leads us to the concept of a double integral. Similarly, in $\mathbf{R}^3$ we use
rectangular parallelepipeds subdivided into smaller parallelepipeds $I_k$ and, by
considering sums of the form $\sum f(x_k, y_k, z_k) V(I_k)$, where $(x_k, y_k, z_k) \in I_k$ and $V(I_k)$
is the volume of $I_k$, we are led to the concept of a triple integral. It is just as easy
to discuss multiple integrals in $\mathbf{R}^n$, provided that we have a suitable generalization
of the notions of area and volume. This "generalized volume" is called measure or
content and is defined in the next section.


14.2 THE MEASURE OF A BOUNDED INTERVAL IN R^n

Let $A_1, \ldots, A_n$ denote $n$ general intervals in $\mathbf{R}^1$; that is, each $A_k$ may be bounded,
unbounded, open, closed, or half-open in $\mathbf{R}^1$. A set $A$ in $\mathbf{R}^n$ of the form

$$A = A_1 \times \cdots \times A_n = \{(x_1, \ldots, x_n) : x_k \in A_k \text{ for } k = 1, 2, \ldots, n\},$$

is called a general $n$-dimensional interval. We also allow the degenerate case in
which one or more of the intervals $A_k$ consists of a single point.

If each $A_k$ is open, closed, or bounded in $\mathbf{R}^1$, then $A$ has the corresponding
property in $\mathbf{R}^n$.

If each $A_k$ is bounded, the $n$-dimensional measure (or $n$-measure) of $A$, denoted
by $\mu(A)$, is defined by the equation

$$\mu(A) = \mu(A_1) \cdots \mu(A_n),$$

where $\mu(A_k)$ is the one-dimensional measure (length) of $A_k$. When $n = 2$, this is
called the area of $A$, and when $n = 3$, it is called the volume of $A$. Note that
$\mu(A) = 0$ if $\mu(A_k) = 0$ for some $k$.
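
The product formula for the n-measure is simple enough to state as a one-line computation. The sketch below is illustrative: it assumes each bounded interval A_k is given by its endpoints (a_k, b_k) with a_k <= b_k, and the function name is hypothetical. Whether the endpoints belong to the interval does not affect its length.

```python
from fractions import Fraction


def interval_measure(bounds):
    """n-measure of A = A_1 x ... x A_n, where bounds[k] = (a_k, b_k) are the endpoints of A_k."""
    measure = Fraction(1)
    for a, b in bounds:
        if b < a:
            raise ValueError("each interval needs a_k <= b_k")
        measure *= Fraction(b) - Fraction(a)  # length of the k-th factor
    return measure


if __name__ == "__main__":
    print(interval_measure([(0, 2), (1, 4)]))          # area of a 2-by-3 rectangle
    print(interval_measure([(0, 1), (0, 1), (0, 1)]))  # volume of the unit cube
    print(interval_measure([(0, 2), (3, 3)]))          # degenerate factor gives measure 0
```

The last call illustrates the remark that the measure vanishes as soon as one factor has length zero.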

We turn next to a discussion of Riemann integration in $\mathbf{R}^n$. The only essential
difference between the case $n = 1$ and the case $n > 1$ is that the quantity
$\Delta x_k = x_k - x_{k-1}$ which was used to measure the length of the subinterval
$[x_{k-1}, x_k]$ is replaced by the measure $\mu(I_k)$ of an $n$-dimensional subinterval. Since
the work proceeds on exactly the same lines as the one-dimensional case, we shall
omit many of the details in the discussions that follow.


14.3 THE RIEMANN INTEGRAL OF A BOUNDED FUNCTION DEFINED
ON A COMPACT INTERVAL IN R^n

Definition 14.1. Let $A = A_1 \times \cdots \times A_n$ be a compact interval in $\mathbf{R}^n$. If $P_k$ is a
partition of $A_k$, the cartesian product

$$P = P_1 \times \cdots \times P_n,$$

is said to be a partition of $A$. If $P_k$ divides $A_k$ into $m_k$ one-dimensional subintervals,
then $P$ determines a decomposition of $A$ as a union of $m_1 \cdots m_n$ $n$-dimensional
intervals (called subintervals of $P$). A partition $P'$ of $A$ is said to be finer than $P$ if
$P \subseteq P'$. The set of all partitions of $A$ will be denoted by $\mathcal{P}(A)$.

Figure 14.1 illustrates partitions of intervals in $\mathbf{R}^2$ and in $\mathbf{R}^3$.




Figure 14.1 


Definition 14.2. Let $f$ be defined and bounded on a compact interval $I$ in $\mathbf{R}^n$. If $P$
is a partition of $I$ into $m$ subintervals $I_1, \ldots, I_m$ and if $\mathbf{t}_k \in I_k$, a sum of the form

$$S(P, f) = \sum_{k=1}^{m} f(\mathbf{t}_k) \mu(I_k)$$

is called a Riemann sum. We say $f$ is Riemann-integrable on $I$ and we write $f \in R$ on
$I$, whenever there exists a real number $A$ having the following property: For every
$\varepsilon > 0$ there exists a partition $P_\varepsilon$ of $I$ such that $P$ finer than $P_\varepsilon$ implies

$$|S(P, f) - A| < \varepsilon,$$

for all Riemann sums $S(P, f)$. When such a number $A$ exists, it is uniquely
determined and is denoted by

$$\int_I f\,d\mathbf{x}, \quad \int_I f(\mathbf{x})\,d\mathbf{x}, \quad \text{or by} \quad \int_I f(x_1, \ldots, x_n)\,d(x_1, \ldots, x_n).$$

NOTE. For $n > 1$ the integral is called a multiple or $n$-fold integral. When $n = 2$
and 3, the terms double and triple integral are used. As in $\mathbf{R}^1$, the symbol $\mathbf{x}$ in
$\int_I f(\mathbf{x})\,d\mathbf{x}$ is a "dummy variable" and may be replaced by any other convenient
symbol. The notation $\int_I f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$ is also used instead of
$\int_I f(x_1, \ldots, x_n)\,d(x_1, \ldots, x_n)$. Double integrals are sometimes written with two
integral signs and triple integrals with three such signs, thus:

$$\iint_I f(x, y)\,dx\,dy, \qquad \iiint_I f(x, y, z)\,dx\,dy\,dz.$$


Definition 14.3. Let $f$ be defined and bounded on a compact interval $I$ in $\mathbf{R}^n$. If $P$ is a partition of $I$ into $m$ subintervals $I_1, \dots, I_m$, let

$$m_k(f) = \inf\{f(\mathbf{x}) : \mathbf{x} \in I_k\}, \qquad M_k(f) = \sup\{f(\mathbf{x}) : \mathbf{x} \in I_k\}.$$

The numbers

$$U(P, f) = \sum_{k=1}^{m} M_k(f)\,\mu(I_k) \qquad \text{and} \qquad L(P, f) = \sum_{k=1}^{m} m_k(f)\,\mu(I_k)$$

are called upper and lower Riemann sums. The upper and lower Riemann integrals of $f$ over $I$ are defined as follows:

$$\overline{\int_I} f\,d\mathbf{x} = \inf\{U(P, f) : P \in \mathscr{P}(I)\},$$
$$\underline{\int_I} f\,d\mathbf{x} = \sup\{L(P, f) : P \in \mathscr{P}(I)\}.$$

The function $f$ is said to satisfy Riemann's condition on $I$ if, for every $\epsilon > 0$, there exists a partition $P_\epsilon$ of $I$ such that $P$ finer than $P_\epsilon$ implies $U(P, f) - L(P, f) < \epsilon$.
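Riemann's condition can be checked by hand when the infimum and supremum on each subinterval are easy to locate. The sketch below assumes a function that is increasing in each variable separately, so that $m_k(f)$ is attained at the lower-left corner and $M_k(f)$ at the upper-right corner; the function name `upper_lower` and the example $f(x, y) = x + y$ are illustrative choices, and exact rational arithmetic is used so the sums are not blurred by rounding.

```python
from fractions import Fraction

def upper_lower(f, a, b, c, d, m):
    """U(P, f) and L(P, f) on the uniform m-by-m partition of [a, b] x [c, d],
    assuming f is increasing in x and in y."""
    hx, hy = Fraction(b - a, m), Fraction(d - c, m)
    U = L = Fraction(0)
    for i in range(m):
        for j in range(m):
            x0, y0 = a + i * hx, c + j * hy
            L += f(x0, y0) * hx * hy            # inf attained at the lower-left corner
            U += f(x0 + hx, y0 + hy) * hx * hy  # sup attained at the upper-right corner
    return U, L

f = lambda x, y: x + y
gaps = [upper_lower(f, 0, 1, 0, 1, m) for m in (4, 16, 64)]
```

For this $f$ on the unit square the gap $U(P, f) - L(P, f)$ works out to $2/m$, so it can be made smaller than any $\epsilon > 0$ by taking $m$ large.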

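NOTE. As in the one-dimensional case, upper and lower integrals have the following properties:

a) $$\overline{\int_I} (f + g)\,d\mathbf{x} \le \overline{\int_I} f\,d\mathbf{x} + \overline{\int_I} g\,d\mathbf{x}, \qquad \underline{\int_I} (f + g)\,d\mathbf{x} \ge \underline{\int_I} f\,d\mathbf{x} + \underline{\int_I} g\,d\mathbf{x}.$$

b) If an interval $I$ is decomposed into a union of two nonoverlapping intervals $I_1$, $I_2$, then we have

$$\overline{\int_I} f\,d\mathbf{x} = \overline{\int_{I_1}} f\,d\mathbf{x} + \overline{\int_{I_2}} f\,d\mathbf{x} \qquad \text{and} \qquad \underline{\int_I} f\,d\mathbf{x} = \underline{\int_{I_1}} f\,d\mathbf{x} + \underline{\int_{I_2}} f\,d\mathbf{x}.$$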

The proof of the following theorem is essentially the same as that of Theorem 
7.19 and will be omitted. 

Theorem 14.4. Let $f$ be defined and bounded on a compact interval $I$ in $\mathbf{R}^n$. Then the following statements are equivalent:

i) $f \in R$ on $I$.

ii) $f$ satisfies Riemann's condition on $I$.

iii) $\underline{\int_I} f\,d\mathbf{x} = \overline{\int_I} f\,d\mathbf{x}$.


14.4 SETS OF MEASURE ZERO AND LEBESGUE’S CRITERION FOR 
EXISTENCE OF A MULTIPLE RIEMANN INTEGRAL 

A subset $T$ of $\mathbf{R}^n$ is said to be of n-measure zero if, for every $\epsilon > 0$, $T$ can be covered by a countable collection of n-dimensional intervals, the sum of whose n-measures is $< \epsilon$.

As in the one-dimensional case, the union of a countable collection of sets of n-measure 0 is itself of n-measure 0. If $m < n$, every subset of $\mathbf{R}^m$, when considered as a subset of $\mathbf{R}^n$, has n-measure 0.
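The definition of measure zero can be made concrete with the standard covering trick for a countable set: give the $k$-th point an interval of length $\epsilon / 2^{k+1}$, so that the lengths sum to less than $\epsilon$. The sketch below does this in $\mathbf{R}^1$ for the first 50 points of $\{1, 1/2, 1/3, \dots\}$; the function name `cover` and the particular point set are assumptions made for the example.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an open interval of length
    eps / 2**(k+1) centred on it."""
    out = []
    for k, p in enumerate(points, start=1):
        half = eps / 2 ** (k + 2)        # half of the length eps / 2**(k+1)
        out.append((p - half, p + half))
    return out

eps = Fraction(1, 100)
pts = [Fraction(1, k) for k in range(1, 51)]
intervals = cover(pts, eps)
total_length = sum(hi - lo for lo, hi in intervals)
```

The geometric series bounds the total length by $\epsilon / 2$ no matter how many points are covered, which is why the argument extends to any countable set.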

A property is said to hold almost everywhere on a set $S$ in $\mathbf{R}^n$ if it holds everywhere on $S$ except for a subset of n-measure 0.

Lebesgue's criterion for the existence of a Riemann integral in $\mathbf{R}^1$ has a straightforward extension to multiple integrals. The proof is analogous to that of Theorem 7.48.

Theorem 14.5. Let $f$ be defined and bounded on a compact interval $I$ in $\mathbf{R}^n$. Then $f \in R$ on $I$ if, and only if, the set of discontinuities of $f$ in $I$ has n-measure zero.

14.5 EVALUATION OF A MULTIPLE INTEGRAL BY ITERATED 
INTEGRATION 

From elementary calculus the reader has learned to evaluate certain double and triple integrals by successive integration with respect to each variable. For example, if $f$ is a function of two variables continuous on a compact rectangle $Q$ in the $xy$-plane, say $Q = \{(x, y) : a \le x \le b,\ c \le y \le d\}$, then for each fixed $y$ in $[c, d]$ the function $F$ defined by the equation $F(x) = f(x, y)$ is continuous (and hence integrable) on $[a, b]$. The value of the integral $\int_a^b F(x)\,dx$ depends on $y$ and defines a new function $G$, where $G(y) = \int_a^b f(x, y)\,dx$. This function $G$ is continuous (by Theorem 7.38), and hence integrable, on $[c, d]$. The integral $\int_c^d G(y)\,dy$ turns out to have the same value as the double integral $\int_Q f(x, y)\,d(x, y)$. That is, we have the equation

$$\int_Q f(x, y)\,d(x, y) = \int_c^d \left[\int_a^b f(x, y)\,dx\right] dy. \tag{1}$$

(This formula will be proved later.) The question now arises as to whether a similar result holds when $f$ is merely integrable (and not necessarily continuous) on $Q$. We can see at once that certain difficulties are inevitable. For example, the inner integral $\int_a^b f(x, y)\,dx$ may not exist for certain values of $y$ even though the double integral exists. In fact, if $f$ is discontinuous at every point of the line segment $y = y_0$, $a \le x \le b$, then $\int_a^b f(x, y_0)\,dx$ will fail to exist. However, this line segment is a set whose 2-measure is zero and therefore does not affect the integrability of $f$ on the whole rectangle $Q$. In a case of this kind we must use upper and lower integrals to obtain a suitable generalization of (1).


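Theorem 14.6. Let $f$ be defined and bounded on a compact rectangle

$$Q = [a, b] \times [c, d] \quad \text{in } \mathbf{R}^2.$$

Then we have:

i) $\underline{\int_Q} f\,d(x, y) \le \underline{\int_a^b} \left[\overline{\int_c^d} f(x, y)\,dy\right] dx \le \overline{\int_a^b} \left[\overline{\int_c^d} f(x, y)\,dy\right] dx \le \overline{\int_Q} f\,d(x, y)$.

ii) Statement (i) holds with $\overline{\int_c^d}$ replaced by $\underline{\int_c^d}$ throughout.

iii) $\underline{\int_Q} f\,d(x, y) \le \underline{\int_c^d} \left[\overline{\int_a^b} f(x, y)\,dx\right] dy \le \overline{\int_c^d} \left[\overline{\int_a^b} f(x, y)\,dx\right] dy \le \overline{\int_Q} f\,d(x, y)$.

iv) Statement (iii) holds with $\overline{\int_a^b}$ replaced by $\underline{\int_a^b}$ throughout.

v) When $\int_Q f(x, y)\,d(x, y)$ exists, we have

$$\int_Q f(x, y)\,d(x, y) = \int_a^b \left[\underline{\int_c^d} f(x, y)\,dy\right] dx = \int_a^b \left[\overline{\int_c^d} f(x, y)\,dy\right] dx$$
$$= \int_c^d \left[\underline{\int_a^b} f(x, y)\,dx\right] dy = \int_c^d \left[\overline{\int_a^b} f(x, y)\,dx\right] dy.$$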


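Proof. To prove (i), define $F$ by the equation

$$F(x) = \overline{\int_c^d} f(x, y)\,dy, \qquad \text{if } x \in [a, b].$$

Then $|F(x)| \le M(d - c)$, where $M = \sup\{|f(x, y)| : (x, y) \in Q\}$, and we can consider

$$\overline{I} = \overline{\int_a^b} F(x)\,dx = \overline{\int_a^b} \left[\overline{\int_c^d} f(x, y)\,dy\right] dx.$$

Similarly, we define

$$\underline{I} = \underline{\int_a^b} F(x)\,dx = \underline{\int_a^b} \left[\overline{\int_c^d} f(x, y)\,dy\right] dx.$$

Let $P_1 = \{x_0, x_1, \dots, x_n\}$ be a partition of $[a, b]$ and let $P_2 = \{y_0, y_1, \dots, y_m\}$ be a partition of $[c, d]$. Then $P = P_1 \times P_2$ is a partition of $Q$ into $mn$ subrectangles $Q_{ij}$ and we define

$$\overline{I}_{ij} = \overline{\int_{x_{i-1}}^{x_i}} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx, \qquad \underline{I}_{ij} = \underline{\int_{x_{i-1}}^{x_i}} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx.$$

Since we have

$$\overline{\int_c^d} f(x, y)\,dy = \sum_{j=1}^{m} \overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy,$$

we can write

$$\overline{\int_a^b} \left[\overline{\int_c^d} f(x, y)\,dy\right] dx \le \sum_{j=1}^{m} \overline{\int_a^b} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx = \sum_{j=1}^{m} \sum_{i=1}^{n} \overline{\int_{x_{i-1}}^{x_i}} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx.$$

That is, we have the inequality

$$\overline{I} \le \sum_{j=1}^{m} \sum_{i=1}^{n} \overline{I}_{ij}.$$

Similarly, we find

$$\underline{I} \ge \sum_{j=1}^{m} \sum_{i=1}^{n} \underline{I}_{ij}.$$

If we write

$$m_{ij} = \inf\{f(x, y) : (x, y) \in Q_{ij}\}$$

and

$$M_{ij} = \sup\{f(x, y) : (x, y) \in Q_{ij}\},$$

then from the inequality $m_{ij} \le f(x, y) \le M_{ij}$, $(x, y) \in Q_{ij}$, we obtain

$$m_{ij}(y_j - y_{j-1}) \le \overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy \le M_{ij}(y_j - y_{j-1}).$$

This, in turn, implies

$$m_{ij}\,\mu(Q_{ij}) \le \underline{\int_{x_{i-1}}^{x_i}} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx \le \overline{\int_{x_{i-1}}^{x_i}} \left[\overline{\int_{y_{j-1}}^{y_j}} f(x, y)\,dy\right] dx \le M_{ij}\,\mu(Q_{ij}).$$

Summing on $i$ and $j$ and using the above inequalities, we get

$$L(P, f) \le \underline{I} \le \overline{I} \le U(P, f).$$

Since this holds for all partitions $P$ of $Q$, we must have

$$\underline{\int_Q} f\,d(x, y) \le \underline{I} \le \overline{I} \le \overline{\int_Q} f\,d(x, y).$$

This proves statement (i).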

It is clear that the preceding proof could also be carried out if the function $F$ were originally defined by the formula

$$F(x) = \underline{\int_c^d} f(x, y)\,dy,$$

and hence (ii) follows by the same argument.

Statements (iii) and (iv) can be similarly proved by interchanging the roles of 
x and y. Finally, statement (v) is an immediate consequence of statements (i) 
through (iv). 

As a corollary, we have the formula mentioned earlier:

$$\int_Q f(x, y)\,d(x, y) = \int_a^b \left[\int_c^d f(x, y)\,dy\right] dx = \int_c^d \left[\int_a^b f(x, y)\,dx\right] dy,$$

which is valid when $f$ is continuous on $Q$. This is often called Fubini's theorem.

NOTE. The existence of the iterated integrals

$$\int_a^b \left[\int_c^d f(x, y)\,dy\right] dx \qquad \text{and} \qquad \int_c^d \left[\int_a^b f(x, y)\,dx\right] dy$$

does not imply the existence of $\int_Q f(x, y)\,d(x, y)$. A counterexample is given in Exercise 14.7.
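For a continuous integrand the equality of the two iterated integrals can be observed numerically. The following sketch approximates each one-dimensional integral by a midpoint rule; the helper names, the grid size, and the test function $f(x, y) = x y^2$ on $Q = [0, 1] \times [0, 2]$ are assumptions chosen for illustration, and the midpoint rule is only a convenient stand-in for the Riemann integral.

```python
def midpoint(g, a, b, n):
    """Midpoint-rule approximation to the one-dimensional integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def dy_then_dx(f, a, b, c, d, n):
    """Integrate in y first (inner), then in x (outer)."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), c, d, n), a, b, n)

def dx_then_dy(f, a, b, c, d, n):
    """Integrate in x first (inner), then in y (outer)."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), a, b, n), c, d, n)

f = lambda x, y: x * y * y          # continuous on Q = [0, 1] x [0, 2]
I_xy = dy_then_dx(f, 0.0, 1.0, 0.0, 2.0, 200)
I_yx = dx_then_dy(f, 0.0, 1.0, 0.0, 2.0, 200)
exact = 4.0 / 3.0                   # (1/2) * (8/3)
```

Both orders agree with each other and with the double integral $4/3$ up to discretization error. A numerical check of this kind says nothing about the discontinuous case, where the theorem's upper and lower integrals are needed.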


Before commenting on the analog of Theorem 14.6 in $\mathbf{R}^n$, we first introduce some further notation and terminology. If $k \le n$, the set of $\mathbf{x}$ in $\mathbf{R}^n$ for which $x_k = 0$ is called the coordinate hyperplane $\Pi_k$. Given a set $S$ in $\mathbf{R}^n$, the projection $S_k$ of $S$ on $\Pi_k$ is defined to be the image of $S$ under that mapping whose value at each point $(x_1, x_2, \dots, x_n)$ in $S$ is $(x_1, \dots, x_{k-1}, 0, x_{k+1}, \dots, x_n)$. It is easy to show that such a mapping is continuous on $S$. It follows that if $S$ is compact, each projection $S_k$ is compact. Also, if $S$ is connected, each $S_k$ is connected. Projections in $\mathbf{R}^3$ are illustrated in Fig. 14.2.

Figure 14.2

A theorem entirely analogous to Theorem 14.6 holds for n-fold integrals. It will suffice to indicate how the extension goes when $n = 3$. In this case, $f$ is defined and bounded on a compact interval $Q = [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3]$ in $\mathbf{R}^3$ and statement (i) of Theorem 14.6 is replaced by

$$\underline{\int_Q} f\,d\mathbf{x} \le \underline{\int_{a_1}^{b_1}} \left[\overline{\int_{Q_1}} f\,d(x_2, x_3)\right] dx_1 \le \overline{\int_{a_1}^{b_1}} \left[\overline{\int_{Q_1}} f\,d(x_2, x_3)\right] dx_1 \le \overline{\int_Q} f\,d\mathbf{x}, \tag{2}$$

where $Q_1$ is the projection of $Q$ on the coordinate plane $\Pi_1$. When $\int_Q f(\mathbf{x})\,d\mathbf{x}$ exists, the analog of part (v) of Theorem 14.6 is the formula

$$\int_Q f(\mathbf{x})\,d\mathbf{x} = \int_{a_1}^{b_1} \left[\overline{\int_{Q_1}} f\,d(x_2, x_3)\right] dx_1 = \int_{Q_1} \left[\overline{\int_{a_1}^{b_1}} f\,dx_1\right] d(x_2, x_3). \tag{3}$$

As in Theorem 14.6, similar statements hold with appropriate replacements of upper integrals by lower integrals, and there are also analogous formulas for the projections $Q_2$ and $Q_3$.

The reader should have no difficulty in stating analogous results for n-fold integrals (they can be proved by the method used in Theorem 14.6). The special case in which the n-fold integral $\int_Q f(\mathbf{x})\,d\mathbf{x}$ exists is of particular importance and can be stated as follows:

Theorem 14.7. Let $f$ be defined and bounded on a compact interval

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n]$$

in $\mathbf{R}^n$. Assume that $\int_Q f(\mathbf{x})\,d\mathbf{x}$ exists. Then

$$\int_Q f\,d\mathbf{x} = \int_{a_1}^{b_1} \left[\overline{\int_{Q_1}} f\,d(x_2, \dots, x_n)\right] dx_1 = \int_{Q_1} \left[\overline{\int_{a_1}^{b_1}} f\,dx_1\right] d(x_2, \dots, x_n).$$

Similar formulas hold with upper integrals replaced by lower integrals and with $Q_1$ replaced by $Q_k$, the projection of $Q$ on $\Pi_k$.

14.6 JORDAN-MEASURABLE SETS IN $\mathbf{R}^n$

Up to this point the multiple integral $\int_I f(\mathbf{x})\,d\mathbf{x}$ has been defined only for intervals $I$. This, of course, is too restrictive for the applications of integration. It is not difficult to extend the definition to encompass more general sets called Jordan-measurable sets. These are discussed in this section. The definition makes use of the boundary of a set $S$ in $\mathbf{R}^n$. We recall that a point $\mathbf{x}$ in $\mathbf{R}^n$ is called a boundary point of $S$ if every n-ball $B(\mathbf{x})$ contains a point in $S$ and also a point not in $S$. The set of all boundary points of $S$ is called the boundary of $S$ and is denoted by $\partial S$. (See Section 3.16.)

Definition 14.8. Let $S$ be a subset of a compact interval $I$ in $\mathbf{R}^n$. For every partition $P$ of $I$ define $\underline{J}(P, S)$ to be the sum of the measures of those subintervals of $P$ which contain only interior points of $S$ and let $\overline{J}(P, S)$ be the sum of the measures of those subintervals of $P$ which contain points of $S \cup \partial S$. The numbers

$$\underline{c}(S) = \sup\{\underline{J}(P, S) : P \in \mathscr{P}(I)\},$$

$$\overline{c}(S) = \inf\{\overline{J}(P, S) : P \in \mathscr{P}(I)\},$$

are called, respectively, the (n-dimensional) inner and outer Jordan content of $S$. The set $S$ is said to be Jordan-measurable if $\underline{c}(S) = \overline{c}(S)$, in which case this common value is called the Jordan content of $S$, denoted by $c(S)$.

It is easy to verify that $\underline{c}(S)$ and $\overline{c}(S)$ depend only on $S$ and not on the interval $I$ which contains $S$. Also, $0 \le \underline{c}(S) \le \overline{c}(S)$.

If $S$ has content zero, then $\underline{c}(S) = \overline{c}(S) = 0$. Hence, for every $\epsilon > 0$, $S$ can be covered by a finite collection of intervals, the sum of whose measures is $< \epsilon$. Note that content zero is described in terms of finite coverings, whereas measure zero is described in terms of countable coverings. Any set with content zero also has measure zero, but the converse is not necessarily true.

Every compact interval $Q$ is Jordan-measurable and its content, $c(Q)$, is equal to its measure, $\mu(Q)$. If $k < n$, the n-dimensional content of every bounded set in $\mathbf{R}^k$ is zero.

Jordan-measurable sets $S$ in $\mathbf{R}^2$ are also said to have area $c(S)$. In this case, the sums $\underline{J}(P, S)$ and $\overline{J}(P, S)$ represent approximations to the area from the "inside" and the "outside" of $S$, respectively. This is illustrated in Fig. 14.3, where the lightly shaded rectangles are counted in $\underline{J}(P, S)$, the heavily shaded rectangles in $\overline{J}(P, S)$. For sets in $\mathbf{R}^3$, $c(S)$ is also called the volume of $S$.

The next theorem shows that a bounded set has Jordan content if, and only if, its boundary isn't too "thick."

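Theorem 14.9. Let $S$ be a bounded set in $\mathbf{R}^n$ and let $\partial S$ denote its boundary. Then we have

$$\overline{c}(\partial S) = \overline{c}(S) - \underline{c}(S).$$

Hence, $S$ is Jordan-measurable if, and only if, $\partial S$ has content zero.

Proof. Let $I$ be a compact interval containing $S$ and $\partial S$. Then for every partition $P$ of $I$ we have

$$\overline{J}(P, \partial S) = \overline{J}(P, S) - \underline{J}(P, S).$$

Therefore, $\overline{J}(P, \partial S) \ge \overline{c}(S) - \underline{c}(S)$ and hence $\overline{c}(\partial S) \ge \overline{c}(S) - \underline{c}(S)$. To obtain the reverse inequality, let $\epsilon > 0$ be given, choose $P_1$ so that $\overline{J}(P_1, S) < \overline{c}(S) + \epsilon/2$ and choose $P_2$ so that $\underline{J}(P_2, S) > \underline{c}(S) - \epsilon/2$. Let $P = P_1 \cup P_2$. Since refinement increases the inner sums $\underline{J}$ and decreases the outer sums $\overline{J}$, we find

$$\overline{c}(\partial S) \le \overline{J}(P, \partial S) = \overline{J}(P, S) - \underline{J}(P, S) \le \overline{J}(P_1, S) - \underline{J}(P_2, S) < \overline{c}(S) - \underline{c}(S) + \epsilon.$$

Since $\epsilon$ is arbitrary, this means that $\overline{c}(\partial S) \le \overline{c}(S) - \underline{c}(S)$. Therefore, $\overline{c}(\partial S) = \overline{c}(S) - \underline{c}(S)$ and the proof is complete.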
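The inner and outer sums of Definition 14.8, and the boundary identity used in the proof above, can be computed exactly for a simple set. The sketch below takes $S$ to be the closed unit disk in $\mathbf{R}^2$ and $P$ the uniform $m \times m$ partition of $[-1, 1] \times [-1, 1]$; the function name `jordan_sums` and this choice of $S$ and $P$ are assumptions made for the example. A closed cell contains only interior points of $S$ exactly when its farthest corner from the origin has norm less than 1, and it meets $S$ exactly when its nearest point to the origin has norm at most 1.

```python
from fractions import Fraction

def jordan_sums(m):
    """Inner and outer sums for the closed unit disk S, with P the uniform
    m-by-m partition of I = [-1, 1] x [-1, 1]."""
    h = Fraction(2, m)
    inner = outer = Fraction(0)
    for i in range(m):
        for j in range(m):
            x0, x1 = -1 + i * h, -1 + (i + 1) * h
            y0, y1 = -1 + j * h, -1 + (j + 1) * h
            # Squared norm of the farthest and nearest points of the closed cell.
            far = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            nx = 0 if x0 <= 0 <= x1 else min(abs(x0), abs(x1))
            ny = 0 if y0 <= 0 <= y1 else min(abs(y0), abs(y1))
            near = nx * nx + ny * ny
            if far < 1:          # cell contains only interior points of S
                inner += h * h
            if near <= 1:        # cell meets S (S is closed, so S with its boundary is S)
                outer += h * h
    return inner, outer

inner, outer = jordan_sums(64)
boundary_sum = outer - inner      # the outer sum for the boundary circle on the same partition
```

The inner and outer sums bracket $\pi$, and passing to the finer partition with $m = 128$ raises the inner sum and lowers the outer sum, as the proof asserts.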


14.7 MULTIPLE INTEGRATION OVER JORDAN-MEASURABLE SETS 

Definition 14.10. Let $f$ be defined and bounded on a bounded Jordan-measurable set $S$ in $\mathbf{R}^n$. Let $I$ be a compact interval containing $S$ and define $g$ on $I$ as follows:

$$g(\mathbf{x}) = \begin{cases} f(\mathbf{x}) & \text{if } \mathbf{x} \in S, \\ 0 & \text{if } \mathbf{x} \in I - S. \end{cases}$$

Then $f$ is said to be Riemann-integrable on $S$, and we write $f \in R$ on $S$, whenever the integral $\int_I g(\mathbf{x})\,d\mathbf{x}$ exists. We also write

$$\int_S f(\mathbf{x})\,d\mathbf{x} = \int_I g(\mathbf{x})\,d\mathbf{x}.$$

The upper and lower integrals $\overline{\int_S} f(\mathbf{x})\,d\mathbf{x}$ and $\underline{\int_S} f(\mathbf{x})\,d\mathbf{x}$ are similarly defined.

NOTE. By considering the Riemann sums which approximate $\int_I g(\mathbf{x})\,d\mathbf{x}$, it is easy to see that the integral $\int_S f(\mathbf{x})\,d\mathbf{x}$ does not depend on the choice of the interval $I$ used to enclose $S$.

A necessary and sufficient condition for the existence of $\int_S f(\mathbf{x})\,d\mathbf{x}$ can now be given.

Theorem 14.11. Let $S$ be a Jordan-measurable set in $\mathbf{R}^n$, and let $f$ be defined and bounded on $S$. Then $f \in R$ on $S$ if, and only if, the discontinuities of $f$ in $S$ form a set of measure zero.

Proof. Let $I$ be a compact interval containing $S$ and let $g(\mathbf{x}) = f(\mathbf{x})$ when $\mathbf{x} \in S$, $g(\mathbf{x}) = 0$ when $\mathbf{x} \in I - S$. The discontinuities of $f$ will be discontinuities of $g$. However, $g$ may also have discontinuities at some or all of the boundary points of $S$. Since $S$ is Jordan-measurable, Theorem 14.9 tells us that $c(\partial S) = 0$. Therefore, $g \in R$ on $I$ if, and only if, the discontinuities of $f$ form a set of measure zero.


14.8 JORDAN CONTENT EXPRESSED AS A RIEMANN INTEGRAL 


Theorem 14.12. Let $S$ be a compact Jordan-measurable set in $\mathbf{R}^n$. Then the integral $\int_S 1$ exists and we have

$$c(S) = \int_S 1.$$

Proof. Let $I$ be a compact interval containing $S$ and let $\chi_S$ denote the characteristic function of $S$. That is,

$$\chi_S(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{x} \in S, \\ 0 & \text{if } \mathbf{x} \in I - S. \end{cases}$$

The discontinuities of $\chi_S$ in $I$ are the boundary points of $S$ and these form a set of content zero, so the integral $\int_I \chi_S$ exists, and hence $\int_S 1$ exists.

Let $P$ be a partition of $I$ into subintervals $I_1, \dots, I_m$, and let

$$A = \{k : I_k \cap S \text{ is nonempty}\}.$$

If $k \in A$, we have

$$M_k(\chi_S) = \sup\{\chi_S(\mathbf{x}) : \mathbf{x} \in I_k\} = 1,$$

and $M_k(\chi_S) = 0$ if $k \notin A$, so

$$U(P, \chi_S) = \sum_{k=1}^{m} M_k(\chi_S)\,\mu(I_k) = \sum_{k \in A} \mu(I_k) = \overline{J}(P, S).$$

Since this holds for all partitions, we have $\overline{\int_I} \chi_S = \overline{c}(S) = c(S)$. But

$$\overline{\int_I} \chi_S = \int_I \chi_S, \qquad \text{so} \qquad c(S) = \int_I \chi_S = \int_S 1.$$

14.9 ADDITIVE PROPERTY OF THE RIEMANN INTEGRAL 

The next theorem shows that the integral is additive with respect to sets having 
Jordan content. 

Theorem 14.13. Assume $f \in R$ on a Jordan-measurable set $S$ in $\mathbf{R}^n$. Suppose $S = A \cup B$, where $A$ and $B$ are Jordan-measurable but have no interior points in common. Then $f \in R$ on $A$, $f \in R$ on $B$, and we have

$$\int_S f(\mathbf{x})\,d\mathbf{x} = \int_A f(\mathbf{x})\,d\mathbf{x} + \int_B f(\mathbf{x})\,d\mathbf{x}. \tag{4}$$

Proof. Let $I$ be a compact interval containing $S$ and define $g$ as follows:

$$g(\mathbf{x}) = \begin{cases} f(\mathbf{x}) & \text{if } \mathbf{x} \in S, \\ 0 & \text{if } \mathbf{x} \in I - S. \end{cases}$$

The existence of $\int_A f(\mathbf{x})\,d\mathbf{x}$ and $\int_B f(\mathbf{x})\,d\mathbf{x}$ is an easy consequence of Theorem 14.11. To prove (4), let $P$ be a partition of $I$ into $m$ subintervals $I_1, \dots, I_m$ and form a Riemann sum

$$S(P, g) = \sum_{k=1}^{m} g(\mathbf{t}_k)\,\mu(I_k).$$

If $S_A$ denotes that part of the sum arising from those subintervals containing points of $A$, and if $S_B$ is similarly defined, we can write

$$S(P, g) = S_A + S_B - S_C,$$

where $S_C$ contains those terms coming from subintervals which contain both points of $A$ and points of $B$. In particular, all points common to the two boundaries $\partial A$ and $\partial B$ will fall in this third class. But now $S_A$ is a Riemann sum approximating the integral $\int_A f(\mathbf{x})\,d\mathbf{x}$, and $S_B$ is a Riemann sum approximating $\int_B f(\mathbf{x})\,d\mathbf{x}$. Since $c(\partial A \cap \partial B) = 0$, it follows that $|S_C|$ can be made arbitrarily small when $P$ is sufficiently fine. The equation in the theorem is an easy consequence of these remarks.

NOTE. Formula (4) also holds for upper and lower integrals.

For sets $S$ whose structure is relatively simple, Theorem 14.6 can be used to obtain formulas for evaluating double integrals by iterated integration. These formulas are given in the next theorem.

Theorem 14.14. Let $\phi_1$ and $\phi_2$ be two continuous functions defined on $[a, b]$ such that $\phi_1(x) \le \phi_2(x)$ for each $x$ in $[a, b]$. Let $S$ be the compact set in $\mathbf{R}^2$ given by

$$S = \{(x, y) : a \le x \le b,\ \phi_1(x) \le y \le \phi_2(x)\}.$$

If $f \in R$ on $S$, we have

$$\int_S f(x, y)\,d(x, y) = \int_a^b \left[\overline{\int_{\phi_1(x)}^{\phi_2(x)}} f(x, y)\,dy\right] dx.$$

NOTE. The set $S$ is Jordan-measurable because its boundary has content zero. (See Exercise 14.9.)

Analogous statements hold for n-fold integrals. The extensions are too obvious 
to require further comment. 
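The iterated formula of Theorem 14.14 is easy to try out numerically on a region bounded by two curves. The sketch below is an illustration under stated assumptions: the helper names, the midpoint-rule approximation, and the example region $S = \{(x, y) : 0 \le x \le 1,\ x^2 \le y \le x\}$ with $f(x, y) = x + y$ are all choices made here, and since $f$ is continuous the inner upper integral is an ordinary integral.

```python
def midpoint(g, a, b, n):
    """Midpoint-rule approximation to the one-dimensional integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integrate_between(f, a, b, phi1, phi2, n=400):
    """For each x integrate f(x, .) over [phi1(x), phi2(x)], then integrate in x over [a, b]."""
    inner = lambda x: midpoint(lambda y: f(x, y), phi1(x), phi2(x), n)
    return midpoint(inner, a, b, n)

# S = {(x, y) : 0 <= x <= 1, x**2 <= y <= x}, f(x, y) = x + y; the exact value is 3/20.
value = integrate_between(lambda x, y: x + y, 0.0, 1.0, lambda x: x * x, lambda x: x)
```

The exact value comes from $\int_0^1 \left[x(x - x^2) + \tfrac{1}{2}(x^2 - x^4)\right] dx = \tfrac{1}{3} - \tfrac{1}{4} + \tfrac{1}{6} - \tfrac{1}{10} = \tfrac{3}{20}$.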



Figure 14.4 


Figure 14.4 illustrates the type of region described in the theorem. For sets which can be decomposed into a finite number of Jordan-measurable regions of this type, we can apply iterated integration to each separate part and add the results in accordance with Theorem 14.13.

14.10 MEAN-VALUE THEOREM FOR MULTIPLE INTEGRALS 

As in the one-dimensional case, multiple integrals satisfy a mean value property. 
This can be obtained as an easy consequence of the following theorem, the proof 
of which is left as an exercise. 

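Theorem 14.15. Assume $f \in R$ and $g \in R$ on a Jordan-measurable set $S$ in $\mathbf{R}^n$. If $f(\mathbf{x}) \le g(\mathbf{x})$ for each $\mathbf{x}$ in $S$, then we have

$$\int_S f(\mathbf{x})\,d\mathbf{x} \le \int_S g(\mathbf{x})\,d\mathbf{x}.$$

Theorem 14.16 (Mean-Value Theorem for multiple integrals). Assume that $g \in R$ and $f \in R$ on a Jordan-measurable set $S$ in $\mathbf{R}^n$ and suppose that $g(\mathbf{x}) \ge 0$ for each $\mathbf{x}$ in $S$. Let $m = \inf f(S)$, $M = \sup f(S)$. Then there exists a real number $\lambda$ in the interval $m \le \lambda \le M$ such that

$$\int_S f(\mathbf{x})g(\mathbf{x})\,d\mathbf{x} = \lambda \int_S g(\mathbf{x})\,d\mathbf{x}. \tag{5}$$

In particular, we have

$$m\,c(S) \le \int_S f(\mathbf{x})\,d\mathbf{x} \le M\,c(S). \tag{6}$$

NOTE. If, in addition, $S$ is connected and $f$ is continuous on $S$, then $\lambda = f(\mathbf{x}_0)$ for some $\mathbf{x}_0$ in $S$ (by Theorem 4.38) and (5) becomes

$$\int_S f(\mathbf{x})g(\mathbf{x})\,d\mathbf{x} = f(\mathbf{x}_0) \int_S g(\mathbf{x})\,d\mathbf{x}. \tag{7}$$

In particular, (7) implies $\int_S f(\mathbf{x})\,d\mathbf{x} = f(\mathbf{x}_0)\,c(S)$, where $\mathbf{x}_0 \in S$.

Proof. Since $g(\mathbf{x}) \ge 0$, we have $m\,g(\mathbf{x}) \le f(\mathbf{x})g(\mathbf{x}) \le M\,g(\mathbf{x})$ for each $\mathbf{x}$ in $S$. By Theorem 14.15, we can write

$$m \int_S g(\mathbf{x})\,d\mathbf{x} \le \int_S f(\mathbf{x})g(\mathbf{x})\,d\mathbf{x} \le M \int_S g(\mathbf{x})\,d\mathbf{x}.$$

If $\int_S g(\mathbf{x})\,d\mathbf{x} = 0$, (5) holds for every $\lambda$. If $\int_S g(\mathbf{x})\,d\mathbf{x} > 0$, (5) holds with $\lambda = \int_S f(\mathbf{x})g(\mathbf{x})\,d\mathbf{x} \big/ \int_S g(\mathbf{x})\,d\mathbf{x}$. Taking $g(\mathbf{x}) = 1$, we obtain (6).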
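The number $\lambda$ in (5) is a weighted average of $f$ with weight $g$, and it can be estimated numerically. The sketch below assumes $S = [0, 1] \times [0, 1]$, $f(x, y) = x + y$ and $g(x, y) = xy$, and uses a midpoint-rule approximation; the helper name `double_midpoint` and these particular functions are illustrative assumptions.

```python
def double_midpoint(h, n=200):
    """Midpoint-rule approximation to the integral of h over the unit square."""
    s = 1.0 / n
    return s * s * sum(h((i + 0.5) * s, (j + 0.5) * s)
                       for i in range(n) for j in range(n))

f = lambda x, y: x + y          # m = inf f(S) = 0, M = sup f(S) = 2 on S = [0, 1] x [0, 1]
g = lambda x, y: x * y          # g >= 0 on S

# lambda = (integral of f*g) / (integral of g), as in the proof of (5).
lam = double_midpoint(lambda x, y: f(x, y) * g(x, y)) / double_midpoint(g)
```

Here the exact integrals are $1/3$ and $1/4$, so $\lambda = 4/3$, which lies between $m = 0$ and $M = 2$. Since $S$ is connected and $f$ is continuous, $\lambda$ is a value of $f$, for instance at $(2/3, 2/3)$.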

We can use (6) to prove that the integrand $f$ can be disturbed on a set of content zero without affecting the value of the integral. In fact, we have the following theorem:

Theorem 14.17. Assume that $f \in R$ on a Jordan-measurable set $S$ in $\mathbf{R}^n$. Let $T$ be a subset of $S$ having n-dimensional Jordan content zero. Let $g$ be a function, defined and bounded on $S$, such that $g(\mathbf{x}) = f(\mathbf{x})$ when $\mathbf{x} \in S - T$. Then $g \in R$ on $S$ and

$$\int_S f(\mathbf{x})\,d\mathbf{x} = \int_S g(\mathbf{x})\,d\mathbf{x}.$$

Proof. Let $h = f - g$. Then $\int_S h(\mathbf{x})\,d\mathbf{x} = \int_T h(\mathbf{x})\,d\mathbf{x} + \int_{S-T} h(\mathbf{x})\,d\mathbf{x}$. However, $\int_T h(\mathbf{x})\,d\mathbf{x} = 0$ because of (6), and $\int_{S-T} h(\mathbf{x})\,d\mathbf{x} = 0$ since $h(\mathbf{x}) = 0$ for each $\mathbf{x}$ in $S - T$.

NOTE. This theorem suggests a way of extending the definition of the Riemann integral $\int_S f(\mathbf{x})\,d\mathbf{x}$ for functions which may not be defined and bounded on the whole of $S$. In fact, let $S$ be a bounded set in $\mathbf{R}^n$ having Jordan content and let $T$ be a subset of $S$ having content zero. If $f$ is defined and bounded on $S - T$ and if $\int_{S-T} f(\mathbf{x})\,d\mathbf{x}$ exists, we agree to write

$$\int_S f(\mathbf{x})\,d\mathbf{x} = \int_{S-T} f(\mathbf{x})\,d\mathbf{x},$$

and to say that $f$ is Riemann-integrable on $S$. In view of the theorem just proved, this is essentially the same as extending the domain of definition of $f$ to the whole of $S$ by defining $f$ on $T$ in such a way that it remains bounded.


EXERCISES 


Multiple integrals 

14.1 If $f_1 \in R$ on $[a_1, b_1], \dots, f_n \in R$ on $[a_n, b_n]$, prove that

$$\int_S f_1(x_1) \cdots f_n(x_n)\,d(x_1, \dots, x_n) = \left(\int_{a_1}^{b_1} f_1(x_1)\,dx_1\right) \cdots \left(\int_{a_n}^{b_n} f_n(x_n)\,dx_n\right),$$

where $S = [a_1, b_1] \times \cdots \times [a_n, b_n]$.

14.2 Let $f$ be defined and bounded on a compact rectangle $Q = [a, b] \times [c, d]$ in $\mathbf{R}^2$. Assume that for each fixed $y$ in $[c, d]$, $f(x, y)$ is an increasing function of $x$, and that for each fixed $x$ in $[a, b]$, $f(x, y)$ is an increasing function of $y$. Prove that $f \in R$ on $Q$.

14.3 Evaluate each of the following double integrals.

a) $\iint_Q \sin^2 x \sin^2 y\,dx\,dy$, where $Q = [0, \pi] \times [0, \pi]$.

b) $\iint_Q |\cos(x + y)|\,dx\,dy$, where $Q = [0, \pi] \times [0, \pi]$.

c) $\iint_Q [x + y]\,dx\,dy$, where $Q = [0, 2] \times [0, 2]$, and $[t]$ is the greatest integer $\le t$.

14.4 Let $Q = [0, 1] \times [0, 1]$ and calculate $\iint_Q f(x, y)\,dx\,dy$ in each case.

a) $f(x, y) = 1 - x - y$ if $x + y \le 1$, $f(x, y) = 0$ otherwise.

b) $f(x, y) = x^2 + y^2$ if $x^2 + y^2 \le 1$, $f(x, y) = 0$ otherwise.

c) $f(x, y) = x + y$ if $x^2 \le y \le 2x^2$, $f(x, y) = 0$ otherwise.

14.5 Define $f$ on the square $Q = [0, 1] \times [0, 1]$ as follows:

$$f(x, y) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 2y & \text{if } x \text{ is irrational}. \end{cases}$$

a) Prove that $\int_0^t f(x, y)\,dy$ exists for $0 \le t \le 1$ and that

$$\underline{\int_0^1} \left[\int_0^t f(x, y)\,dy\right] dx = t^2 \qquad \text{and} \qquad \overline{\int_0^1} \left[\int_0^t f(x, y)\,dy\right] dx = t.$$

This shows that $\int_0^1 \left[\int_0^1 f(x, y)\,dy\right] dx$ exists and equals 1.

b) Prove that $\int_0^1 \left[\overline{\int_0^1} f(x, y)\,dx\right] dy$ exists and find its value.

c) Prove that the double integral $\int_Q f(x, y)\,d(x, y)$ does not exist.

14.6 Define $f$ on the square $Q = [0, 1] \times [0, 1]$ as follows:

$$f(x, y) = \begin{cases} 0 & \text{if at least one of } x, y \text{ is irrational}, \\ 1/n & \text{if } y \text{ is rational and } x = m/n, \end{cases}$$

where $m$ and $n$ are relatively prime integers, $n > 0$. Prove that

$$\int_0^1 f(x, y)\,dx = \int_0^1 \left[\int_0^1 f(x, y)\,dx\right] dy = \int_Q f(x, y)\,d(x, y) = 0$$

but that $\int_0^1 f(x, y)\,dy$ does not exist for rational $x$.

14.7 If $p_k$ denotes the $k$th prime number, let

$$S(p_k) = \left\{\left(\frac{n}{p_k}, \frac{m}{p_k}\right) : n = 1, 2, \dots, p_k - 1,\ m = 1, 2, \dots, p_k - 1\right\},$$

let $S = \bigcup_{k=1}^{\infty} S(p_k)$, and let $Q = [0, 1] \times [0, 1]$.

a) Prove that $S$ is dense in $Q$ (that is, the closure of $S$ contains $Q$) but that any line parallel to the coordinate axes contains at most a finite subset of $S$.

b) Define $f$ on $Q$ as follows:

$$f(x, y) = 0 \text{ if } (x, y) \in S, \qquad f(x, y) = 1 \text{ if } (x, y) \in Q - S.$$

Prove that $\int_0^1 \left[\int_0^1 f(x, y)\,dy\right] dx = \int_0^1 \left[\int_0^1 f(x, y)\,dx\right] dy = 1$, but that the double integral $\int_Q f(x, y)\,d(x, y)$ does not exist.


Jordan content 

14.8 Let $S$ be a bounded set in $\mathbf{R}^n$ having at most a finite number of accumulation points. Prove that $c(S) = 0$.

14.9 Let $f$ be a continuous real-valued function defined on $[a, b]$. Let $S$ denote the graph of $f$, that is, $S = \{(x, y) : y = f(x),\ a \le x \le b\}$. Prove that $S$ has two-dimensional Jordan content zero.

14.10 Let $\Gamma$ be a rectifiable curve in $\mathbf{R}^n$. Prove that $\Gamma$ has n-dimensional Jordan content zero.

14.11 Let $f$ be a nonnegative function defined on a set $S$ in $\mathbf{R}^n$. The ordinate set of $f$ over $S$ is defined to be the following subset of $\mathbf{R}^{n+1}$:

$$\{(x_1, \dots, x_n, x_{n+1}) : (x_1, \dots, x_n) \in S,\ 0 \le x_{n+1} \le f(x_1, \dots, x_n)\}.$$

If $S$ is a Jordan-measurable region in $\mathbf{R}^n$ and if $f$ is continuous on $S$, prove that the ordinate set of $f$ over $S$ has (n + 1)-dimensional Jordan content whose value is

$$\int_S f(x_1, \dots, x_n)\,d(x_1, \dots, x_n).$$

Interpret this problem geometrically when $n = 1$ and $n = 2$.

14.12 Assume that $f \in R$ on $S$ and suppose $\int_S f(\mathbf{x})\,d\mathbf{x} = 0$. ($S$ is a subset of $\mathbf{R}^n$.) Let $A = \{\mathbf{x} : \mathbf{x} \in S,\ f(\mathbf{x}) < 0\}$ and assume that $c(A) = 0$. Prove that there exists a set $B$ of measure zero such that $f(\mathbf{x}) = 0$ for each $\mathbf{x}$ in $S - B$.

14.13 Assume that $f \in R$ on $S$, where $S$ is a region in $\mathbf{R}^n$ and $f$ is continuous on $S$. Prove that there exists an interior point $\mathbf{x}_0$ of $S$ such that

$$\int_S f(\mathbf{x})\,d\mathbf{x} = f(\mathbf{x}_0)\,c(S).$$

14.14 Let $f$ be continuous on a rectangle $Q = [a, b] \times [c, d]$. For each interior point $(x_1, x_2)$ in $Q$, define

$$F(x_1, x_2) = \int_a^{x_1} \left(\int_c^{x_2} f(x, y)\,dy\right) dx.$$

Prove that $D_{1,2}F(x_1, x_2) = D_{2,1}F(x_1, x_2) = f(x_1, x_2)$.

14.15 Let $T$ denote the following triangular region in the plane:

$$T = \left\{(x, y) : 0 \le \frac{x}{a} + \frac{y}{b} \le 1\right\}, \qquad \text{where } a > 0,\ b > 0.$$

Assume that $f$ has a continuous second-order partial derivative $D_{1,2}f$ on $T$. Prove that there is a point $(x_0, y_0)$ on the segment joining $(a, 0)$ and $(0, b)$ such that

$$\int_T D_{1,2}f(x, y)\,d(x, y) = f(0, 0) - f(a, 0) + a\,D_1 f(x_0, y_0).$$

CHAPTER 15


MULTIPLE LEBESGUE INTEGRALS 


15.1 INTRODUCTION 

The Lebesgue integral was described in Chapter 10 for functions defined on subsets of $\mathbf{R}^1$. The method used there can be generalized to provide a theory of Lebesgue integration for functions defined on subsets of n-dimensional space $\mathbf{R}^n$. The resulting integrals are called multiple integrals. When $n = 2$ they are called double integrals, and when $n = 3$ they are called triple integrals.

As in the one-dimensional case, multiple Lebesgue integration is an extension of multiple Riemann integration. It permits more general functions as integrands, it treats unbounded as well as bounded functions, and it encompasses more general sets as regions of integration.

The basic definitions and the principal convergence theorems are completely analogous to the one-dimensional case. However, there is one new feature that does not appear in $\mathbf{R}^1$. A multiple integral in $\mathbf{R}^n$ can be evaluated by calculating a succession of $n$ one-dimensional integrals. This result, called Fubini's Theorem, is one of the principal concerns of this chapter.

As in the one-dimensional case we define the integral first for step functions, then for a larger class (called upper functions) which contains limits of certain increasing sequences of step functions, and finally for an even larger class, the Lebesgue-integrable functions. Since the development proceeds on exactly the same lines as in the one-dimensional case, we shall omit most of the details of the proofs.

We recall some of the concepts introduced in Chapter 14. If $I = I_1 \times \cdots \times I_n$ is a bounded interval in $\mathbf{R}^n$, the n-measure of $I$ is defined by the equation

$$\mu(I) = \mu(I_1) \cdots \mu(I_n),$$

where $\mu(I_k)$ is the one-dimensional measure, or length, of $I_k$.

A subset $T$ of $\mathbf{R}^n$ is said to be of n-measure 0 if, for every $\epsilon > 0$, $T$ can be covered by a countable collection of n-dimensional intervals, the sum of whose n-measures is $< \epsilon$.

A property is said to hold almost everywhere on a set $S$ in $\mathbf{R}^n$ if it holds everywhere on $S$ except for a subset of n-measure 0. For example, if $\{f_n\}$ is a sequence of functions, we say $f_n \to f$ almost everywhere on $S$ if $\lim_{n \to \infty} f_n(\mathbf{x}) = f(\mathbf{x})$ for all $\mathbf{x}$ in $S$ except for those $\mathbf{x}$ in a subset of n-measure 0.

15.2 STEP FUNCTIONS AND THEIR INTEGRALS

Let $I$ be a compact interval in $\mathbf{R}^n$, say

$$I = I_1 \times \cdots \times I_n,$$

where each $I_k$ is a compact subinterval of $\mathbf{R}^1$. If $P_k$ is a partition of $I_k$, the cartesian product $P = P_1 \times \cdots \times P_n$ is called a partition of $I$. If $P_k$ decomposes $I_k$ into $m_k$ one-dimensional subintervals, then $P$ decomposes $I$ into $m = m_1 \cdots m_n$ n-dimensional subintervals, say $J_1, \dots, J_m$.

A function $s$ defined on $I$ is called a step function if a partition $P$ of $I$ exists such that $s$ is constant on the interior of each subinterval $J_k$, say

$$s(\mathbf{x}) = c_k \quad \text{if } \mathbf{x} \in \operatorname{int} J_k.$$

The integral of $s$ over $I$ is defined by the equation

$$\int_I s = \sum_{k=1}^{m} c_k\,\mu(J_k). \tag{1}$$

Now let $G$ be a general n-dimensional interval, that is, an interval in $\mathbf{R}^n$ which need not be compact. A function $s$ is called a step function on $G$ if there is a compact n-dimensional subinterval $I$ of $G$ such that $s$ is a step function on $I$ and $s(\mathbf{x}) = 0$ if $\mathbf{x} \in G - I$. The integral of $s$ over $G$ is defined by the formula

$$\int_G s = \int_I s,$$

where the integral over $I$ is given by (1). As in the one-dimensional case the integral is independent of the choice of $I$.
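The defining sum for the integral of a step function is a finite computation. The sketch below evaluates it exactly for a small example in $\mathbf{R}^2$; the function name `step_integral`, the way the constants $c_k$ are supplied (as a function of the box), and the particular step function are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

def step_integral(partitions, value):
    """Sum of c_k * mu(J_k) over the subintervals J_k of P = P_1 x ... x P_n.
    value(box) returns the constant c_k taken on the interior of the box."""
    edges = [list(zip(p[:-1], p[1:])) for p in partitions]
    total = Fraction(0)
    for box in product(*edges):
        mu = Fraction(1)
        for lo, hi in box:
            mu *= hi - lo                 # mu(J_k) = product of edge lengths
        total += value(box) * mu
    return total

P1 = [Fraction(0), Fraction(1, 2), Fraction(1)]
P2 = [Fraction(0), Fraction(1, 3), Fraction(1)]
# Hypothetical step function: c_k counts the coordinates whose subinterval starts at 0.
count_zero_starts = lambda box: sum(1 for lo, _ in box if lo == 0)
integral = step_integral([P1, P2], count_zero_starts)
```

The four boxes contribute $2 \cdot \tfrac{1}{6} + 1 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{6} + 0 = \tfrac{5}{6}$. The values of $s$ on the edges of the boxes play no role, in line with the definition, which only constrains $s$ on interiors.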

15.3 UPPER FUNCTIONS AND LEBESGUE-INTEGRABLE FUNCTIONS

Upper functions and Lebesgue-integrable functions are defined exactly as in the one-dimensional case.

A real-valued function $f$ defined on an interval $I$ in $\mathbf{R}^n$ is called an upper function on $I$, and we write $f \in U(I)$, if there exists an increasing sequence of step functions $\{s_n\}$ such that

a) $s_n \to f$ almost everywhere on $I$,
and

b) $\lim_{n \to \infty} \int_I s_n$ exists.

The sequence $\{s_n\}$ is said to generate $f$. The integral of $f$ over $I$ is defined by the equation

$$\int_I f = \lim_{n \to \infty} \int_I s_n.$$

We denote by $L(I)$ the set of all functions $f$ of the form $f = u - v$, where $u \in U(I)$ and $v \in U(I)$. Each function $f$ in $L(I)$ is said to be Lebesgue-integrable on $I$, and its integral is defined by the equation

$$\int_I f = \int_I u - \int_I v.$$
Since these definitions are completely analogous to the one-dimensional case, 
it is not surprising to learn that many of the theorems derived from these definitions 
are also valid. In particular, Theorems 10.5, 10.6, 10.7, 10.9, 10.10, 10.11, 10.13, 
10.14, 10.16, 10.17(a) and (c), 10.18, and 10.19 are all valid for multiple integrals. 
Theorem 10.17(b), which describes the behavior of an integral under expansion or 
contraction of the interval of integration, needs to be modified as follows : 

If $f \in L(I)$ and if $g(\mathbf{x}) = f(\mathbf{x}/c)$, where $c > 0$, then $g \in L(cI)$ and

$$\int_{cI} g = c^n \int_I f.$$

In other words, expansion of the interval by a positive factor $c$ has the effect of multiplying the integral by $c^n$, where $n$ is the dimension of the space.
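The factor $c^n$ can be seen in a small numerical check. The sketch below assumes $n = 2$, $I = [0, 1] \times [0, 1]$, $c = 3$ and the continuous function $f(x, y) = xy$, for which the Lebesgue and Riemann integrals coincide, so a midpoint-rule approximation is adequate; the helper name `midpoint2` and these choices are illustrative only.

```python
def midpoint2(h, side, n=100):
    """Midpoint-rule approximation to the integral of h over [0, side] x [0, side]."""
    s = side / n
    return s * s * sum(h((i + 0.5) * s, (j + 0.5) * s)
                       for i in range(n) for j in range(n))

c, n_dim = 3.0, 2
f = lambda x, y: x * y
g = lambda x, y: f(x / c, y / c)       # g(x) = f(x / c), defined on cI

lhs = midpoint2(g, c)                  # integral of g over cI = [0, 3] x [0, 3]
rhs = c ** n_dim * midpoint2(f, 1.0)   # c^n times the integral of f over I
```

Here $\int_I f = 1/4$ and $\int_{cI} g = 9/4 = 3^2 \cdot \tfrac{1}{4}$, so both sides equal $2.25$.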

The Levi convergence theorems (Theorems 10.22 through 10.26), and the 
Lebesgue dominated convergence theorem (Theorem 10.27) and its consequences 
(Theorems 10.28, 10.29, and 10.30) are also valid for multiple integrals. 

NOTATION. The integral $\int_I f$ is also denoted by

$$\int_I f(\mathbf{x})\,d\mathbf{x} \qquad \text{or} \qquad \int_I f(x_1, \dots, x_n)\,d(x_1, \dots, x_n).$$

The notation $\int_I f(x_1, \dots, x_n)\,dx_1 \cdots dx_n$ is also used. Double integrals are sometimes written with two integral signs, and triple integrals with three such signs, thus:

$$\iint_I f(x, y)\,dx\,dy, \qquad \iiint_I f(x, y, z)\,dx\,dy\,dz.$$


15.4 MEASURABLE FUNCTIONS AND MEASURABLE SETS IN $\mathbf{R}^n$

A real-valued function $f$ defined on an interval $I$ in $\mathbf{R}^n$ is called measurable on $I$, and we write $f \in M(I)$, if there exists a sequence of step functions $\{s_n\}$ on $I$ such that

$$\lim_{n \to \infty} s_n(\mathbf{x}) = f(\mathbf{x}) \quad \text{a.e. on } I.$$

The properties of measurable functions described in Theorems 10.35, 10.36, and 10.37 are also valid in this more general setting.

A subset $S$ of $\mathbf{R}^n$ is called measurable if its characteristic function $\chi_S$ is measurable. If, in addition, $\chi_S$ is Lebesgue-integrable on $\mathbf{R}^n$, then the n-measure $\mu(S)$ of the set $S$ is defined by the equation

$$\mu(S) = \int_{\mathbf{R}^n} \chi_S.$$

If $\chi_S$ is measurable but not in $L(\mathbf{R}^n)$, we define $\mu(S) = +\infty$. The function $\mu$ so defined is called n-dimensional Lebesgue measure.

The properties of measure described in Theorems 10.44 through 10.47 are also valid for n-dimensional Lebesgue measure. Also, the Lebesgue integral can be defined for arbitrary subsets of $\mathbf{R}^n$ by the method used in Section 10.19.

We emphasize in particular the countably additive property of Lebesgue measure described in Theorem 10.47:

If $\{A_1, A_2, \dots\}$ is a countable disjoint collection of measurable sets in $\mathbf{R}^n$, then the union $\bigcup_{i=1}^{\infty} A_i$ is measurable and

$$\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i).$$

The next theorem shows that every open subset of R" is measurable. 


Theorem 15.1. Every open set S in R" can be expressed as the union of a countable 
disjoint collection of bounded cubes whose closure is contained in S. Therefore S is 
measurable. Moreover, if S is bounded, then p(S) is finite. 


Proof. Fix an integer $m \ge 1$ and consider all half-open intervals in $\mathbf{R}^1$ of the form 

$$\Bigl(\frac{k}{2^m}, \frac{k+1}{2^m}\Bigr] \quad \text{for } k = 0, \pm 1, \pm 2, \ldots$$

All the intervals are of length $2^{-m}$, and they form a countable disjoint collection 
whose union is $\mathbf{R}^1$. The cartesian product of $n$ such intervals is an n-dimensional 
cube of edge-length $2^{-m}$. Let $F_m$ denote the collection of all these cubes. Then $F_m$ 
is a countable disjoint collection whose union is $\mathbf{R}^n$. Note that the cubes in $F_{m+1}$ 
are obtained by bisecting the edges of those in $F_m$. Therefore, if $Q_m$ is a cube in $F_m$ 
and if $Q_{m+1}$ is a cube in $F_{m+1}$, then either $Q_{m+1} \subseteq Q_m$, or $Q_{m+1}$ and $Q_m$ are 
disjoint. 

Now we extract a subcollection $G_m$ from $F_m$ as follows. If $m = 1$, $G_1$ consists 
of all cubes in $F_1$ whose closure lies in $S$. If $m = 2$, $G_2$ consists of all cubes in $F_2$ 
whose closure lies in $S$ but not in any of the cubes in $G_1$. If $m = 3$, $G_3$ consists 
of all cubes in $F_3$ whose closure lies in $S$ but not in any of the cubes in $G_1$ or $G_2$, 
and so on. The construction is illustrated in Fig. 15.1 where $S$ is a quarter of an 
open disk in $\mathbf{R}^2$. The blank square is in $G_1$, the lightly shaded ones are in $G_2$, 
and the darker ones are in $G_3$. 

Now let 

$$T = \bigcup_{m=1}^{\infty} \bigcup_{Q \in G_m} Q.$$








That is, $T$ is the union of all the cubes in $G_1, G_2, \ldots$ We will prove that $S = T$ 
and this will prove the theorem because $T$ is the union of a countable disjoint 
collection of cubes whose closure lies in $S$. Now $T \subseteq S$ because each $Q$ in $G_m$ is a 
subset of $S$. Hence we need only show that $S \subseteq T$. 

Let $\mathbf{p} = (p_1, \ldots, p_n)$ be a point in $S$. Since $S$ is open, there is a cube with 
center $\mathbf{p}$ and edge-length $\delta > 0$, which lies in $S$. Choose $m$ so that $2^{-m} < \delta/2$. 
Then for each $i$ we have 

$$p_i - \frac{\delta}{2} < p_i - \frac{1}{2^m} < p_i < p_i + \frac{1}{2^m} < p_i + \frac{\delta}{2}.$$

Now choose $k_i$ so that 

$$\frac{k_i}{2^m} < p_i \le \frac{k_i + 1}{2^m},$$

and let $Q$ be the Cartesian product of the intervals $(k_i 2^{-m}, (k_i + 1) 2^{-m}]$ for 
$i = 1, 2, \ldots, n$. Then $\mathbf{p} \in Q$ for some cube $Q$ in $F_m$. If $m$ is the smallest integer 
with this property, then $Q \in G_m$, so $\mathbf{p} \in T$. Hence $S \subseteq T$. The statements about 
the measurability of $S$ follow at once from the countably additive property of 
Lebesgue measure. 
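
The construction in the proof can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes that $S$ is the open quarter disk $\{(x, y) : x > 0,\ y > 0,\ x^2 + y^2 < 1\}$ and stops at a finite depth `M`. The names `dyadic_cover` and `total_area` are hypothetical. A cube of level $m$ is stored as an index pair `(i, j)` standing for $(i 2^{-m}, (i+1) 2^{-m}] \times (j 2^{-m}, (j+1) 2^{-m}]$.

```python
from math import pi

def dyadic_cover(M):
    """Return {m: G_m} for m = 1..M, where G_m is a set of index pairs (i, j).

    Assumption: S is the open quarter disk x > 0, y > 0, x^2 + y^2 < 1.
    """
    def closure_in_S(i, j, m):
        # The closed cube [i/2^m, (i+1)/2^m] x [j/2^m, (j+1)/2^m] lies in S
        # exactly when its lower corner has positive coordinates and its
        # upper corner is strictly inside the unit circle.
        return i >= 1 and j >= 1 and (i + 1) ** 2 + (j + 1) ** 2 < 4 ** m

    G = {}
    for m in range(1, M + 1):
        G[m] = set()
        for i in range(2 ** m):
            for j in range(2 ** m):
                if not closure_in_S(i, j, m):
                    continue
                # The level-k cube containing cube (i, j) of level m has
                # indices (i >> (m - k), j >> (m - k)).
                covered = any((i >> (m - k), j >> (m - k)) in G[k]
                              for k in range(1, m))
                if not covered:
                    G[m].add((i, j))
    return G

def total_area(G):
    # Cubes of level m have area 4^(-m); all selected cubes are disjoint.
    return sum(len(cubes) / 4 ** m for m, cubes in G.items())

G = dyadic_cover(7)
print(total_area(G), pi / 4)
```

Because the selected cubes are disjoint and lie in $S$, the printed area stays below $\pi/4$ and increases toward it as `M` grows, in line with the countable additivity used at the end of the proof.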

NOTE. If $S$ is measurable, so is $\mathbf{R}^n - S$ because $\chi_{\mathbf{R}^n - S} = 1 - \chi_S$. Therefore, 
every closed subset of $\mathbf{R}^n$ is measurable. 


15.5 FUBINI’S REDUCTION THEOREM FOR THE DOUBLE INTEGRAL OF 
A STEP FUNCTION 

Up to this point, Lebesgue theory in $\mathbf{R}^n$ is completely analogous to the 
one-dimensional case. New ideas are required when we come to Fubini’s theorem for 
calculating a multiple integral in $\mathbf{R}^n$ by iterated lower-dimensional integrals. To 
better understand what is needed, we consider first the two-dimensional case. 

Let us recall the corresponding result for multiple Riemann integrals. If 
$I = [a, b] \times [c, d]$ is a compact interval in $\mathbf{R}^2$ and if $f$ is Riemann-integrable 
on $I$, then we have the following reduction formula (from part (v) of Theorem 
14.6): 

$$\iint_I f(x, y)\, d(x, y) = \int_c^d \Bigl[\, \underline{\int_a^b} f(x, y)\, dx \Bigr] dy. \qquad (3)$$

There is a companion formula with the lower integral $\underline{\int_a^b}$ replaced by the upper 
integral $\overline{\int_a^b}$, and there are two similar formulas with the order of integration 
reversed. The upper and lower integrals are needed here because the hypothesis of 
Riemann-integrability on $I$ is not strong enough to ensure the existence of the 
one-dimensional Riemann integral $\int_a^b f(x, y)\, dx$. This difficulty does not arise in 
the Lebesgue theory. Fubini’s theorem for double Lebesgue integrals gives us the 
reduction formulas 


$$\iint_I f(x, y)\, d(x, y) = \int_c^d \Bigl[ \int_a^b f(x, y)\, dx \Bigr] dy = \int_a^b \Bigl[ \int_c^d f(x, y)\, dy \Bigr] dx,$$

under the sole hypothesis that $f$ is Lebesgue-integrable on $I$. We will show that the 
inner integrals always exist as Lebesgue integrals. This is another example 
illustrating how Lebesgue theory overcomes difficulties inherent in the Riemann theory. 

In this section we prove Fubini’s theorem for step functions, and in a later 
section we extend it to arbitrary Lebesgue integrable functions. 


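Theorem 15.2 (Fubini’s theorem for step functions). Let $s$ be a step function on 
$\mathbf{R}^2$. Then for each fixed $y$ in $\mathbf{R}^1$ the integral $\int_{\mathbf{R}^1} s(x, y)\, dx$ exists and, as a function 
of $y$, is Lebesgue-integrable on $\mathbf{R}^1$. Moreover, we have 

$$\iint_{\mathbf{R}^2} s(x, y)\, d(x, y) = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} s(x, y)\, dx \Bigr] dy. \qquad (4)$$

Similarly, for each fixed $x$ in $\mathbf{R}^1$ the integral $\int_{\mathbf{R}^1} s(x, y)\, dy$ exists and, as a function 
of $x$, is Lebesgue-integrable on $\mathbf{R}^1$. Also, we have 

$$\iint_{\mathbf{R}^2} s(x, y)\, d(x, y) = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} s(x, y)\, dy \Bigr] dx. \qquad (5)$$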



Proof. This theorem can be derived from the reduction formula (3) for Riemann 
integrals, but we prefer to give a direct proof independent of the Riemann theory. 

There is a compact interval $I = [a, b] \times [c, d]$ such that $s$ is a step function 
on $I$ and $s(x, y) = 0$ if $(x, y) \in \mathbf{R}^2 - I$. There is a partition of $I$ into $mn$ 
subrectangles $I_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$ such that $s$ is constant on the interior of 
$I_{ij}$, say 

$$s(x, y) = c_{ij} \quad \text{if } (x, y) \in \operatorname{int} I_{ij}.$$

Then 

$$\iint_{I_{ij}} s(x, y)\, d(x, y) = c_{ij} (x_i - x_{i-1})(y_j - y_{j-1}) = \int_{y_{j-1}}^{y_j} \Bigl[ \int_{x_{i-1}}^{x_i} s(x, y)\, dx \Bigr] dy.$$

Summing on $i$ and $j$ we find 

$$\iint_I s(x, y)\, d(x, y) = \int_c^d \Bigl[ \int_a^b s(x, y)\, dx \Bigr] dy.$$

Since $s$ vanishes outside $I$, this proves (4), and a similar argument proves (5). 
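
The bookkeeping in this proof can be checked on a small example. The sketch below is an illustration only, not part of the original text: the grid `xs`, `ys` and the constants `c[i][j]` are made-up data, and exact rational arithmetic is used so that the three values can be compared with `==`.

```python
from fractions import Fraction as F

# Hypothetical step function: s = c[i][j] on the open rectangle
# (xs[i], xs[i+1]) x (ys[j], ys[j+1]) and s = 0 outside [0, 3] x [0, 6].
xs = [F(0), F(1), F(3)]
ys = [F(0), F(2), F(5), F(6)]
c = [[F(2), F(-1), F(4)],
     [F(0), F(3), F(1)]]

dx = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]
dy = [ys[j + 1] - ys[j] for j in range(len(ys) - 1)]

# Double integral: sum of c_ij times the area of the rectangle I_ij.
double = sum(c[i][j] * dx[i] * dy[j]
             for i in range(len(dx)) for j in range(len(dy)))

# Integrate in x first: for y in the j-th strip the inner integral is constant.
inner_x = [sum(c[i][j] * dx[i] for i in range(len(dx))) for j in range(len(dy))]
iter_xy = sum(inner_x[j] * dy[j] for j in range(len(dy)))

# Integrate in y first.
inner_y = [sum(c[i][j] * dy[j] for j in range(len(dy))) for i in range(len(dx))]
iter_yx = sum(inner_y[i] * dx[i] for i in range(len(dx)))

print(double, iter_xy, iter_yx)
```

The values of $s$ on the grid lines are ignored, since those lines have 2-measure 0 and each of them meets almost every horizontal or vertical line in a set of 1-measure 0.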

To extend Fubini’s theorem to Lebesgue-integrable functions we need some 
further results concerning sets of measure zero. These are discussed in the next 
section. 


15.6 SOME PROPERTIES OF SETS OF MEASURE ZERO 

Theorem 15.3. Let $S$ be a subset of $\mathbf{R}^n$. Then $S$ has n-measure 0 if, and only if, there 
exists a countable collection of n-dimensional intervals $\{J_1, J_2, \ldots\}$, the sum of 
whose n-measures is finite, such that each point in $S$ belongs to $J_k$ for infinitely 
many $k$. 

Proof. Assume first that $S$ has n-measure 0. Then, for every $m \ge 1$, $S$ can be 
covered by a countable collection of n-dimensional intervals $\{I_{m,1}, I_{m,2}, \ldots\}$, the 
sum of whose n-measures is $< 2^{-m}$. The set $A$ consisting of all intervals $I_{m,k}$ for 
$m = 1, 2, \ldots$, and $k = 1, 2, \ldots$, is a countable collection which covers $S$, and 
the sum of the n-measures of all these intervals is $< \sum_{m=1}^{\infty} 2^{-m} = 1$. Moreover, 
if $\mathbf{a} \in S$ then, for each $m$, $\mathbf{a} \in I_{m,k}$ for some $k$. Therefore if we write 
$A = \{J_1, J_2, \ldots\}$, we see that $\mathbf{a}$ belongs to $J_k$ for infinitely many $k$. 

Conversely, assume that there is a countable collection of n-dimensional 
intervals $\{J_1, J_2, \ldots\}$ such that the series $\sum_{k=1}^{\infty} \mu(J_k)$ converges and such that each 
point in $S$ belongs to $J_k$ for infinitely many $k$. Given $\varepsilon > 0$, there is an integer $N$ 
such that 

$$\sum_{k=N}^{\infty} \mu(J_k) < \varepsilon.$$

Each point of $S$ lies in the set $\bigcup_{k=N}^{\infty} J_k$, so $S \subseteq \bigcup_{k=N}^{\infty} J_k$. Thus, $S$ has been 
covered by a countable collection of intervals, the sum of whose n-measures is 
$< \varepsilon$, so $S$ has n-measure 0. 

Definition 15.4. If $S$ is an arbitrary subset of $\mathbf{R}^2$, and if $(x, y) \in \mathbf{R}^2$, we denote by 
$S^y$ and $S_x$ the following subsets of $\mathbf{R}^1$: 

$$S^y = \{x : x \in \mathbf{R}^1 \text{ and } (x, y) \in S\},$$

$$S_x = \{y : y \in \mathbf{R}^1 \text{ and } (x, y) \in S\}.$$

Examples are shown in Fig. 15.2. Geometrically, $S^y$ is the projection on the 
x-axis of a horizontal cross section of $S$; and $S_x$ is the projection on the y-axis of a 
vertical cross section of $S$. 


Theorem 15.5. If $S$ is a subset of $\mathbf{R}^2$ with 2-measure 0, then $S^y$ has 1-measure 0 for 
almost all $y$ in $\mathbf{R}^1$, and $S_x$ has 1-measure 0 for almost all $x$ in $\mathbf{R}^1$. 

Proof. We will prove that $S^y$ has 1-measure 0 for almost all $y$ in $\mathbf{R}^1$. The proof 
makes use of Theorem 15.3. 

Since $S$ has 2-measure 0, by Theorem 15.3 there is a countable collection of 
rectangles $\{I_k\}$ such that the series 

$$\sum_{k=1}^{\infty} \mu(I_k) \text{ converges,} \qquad (6)$$

and such that every point $(x, y)$ of $S$ belongs to $I_k$ for infinitely many $k$. Write 
$I_k = X_k \times Y_k$, where $X_k$ and $Y_k$ are subintervals of $\mathbf{R}^1$. Then 

$$\mu(I_k) = \mu(X_k)\mu(Y_k) = \mu(X_k) \int_{\mathbf{R}^1} \chi_{Y_k} = \int_{\mathbf{R}^1} \mu(X_k) \chi_{Y_k},$$

where $\chi_{Y_k}$ is the characteristic function of the interval $Y_k$. Let $g_k = \mu(X_k)\chi_{Y_k}$. 
Then (6) implies that the series 

$$\sum_{k=1}^{\infty} \int_{\mathbf{R}^1} g_k \text{ converges.}$$


Now $\{g_k\}$ is a sequence of nonnegative functions in $L(\mathbf{R}^1)$ such that the series 
$\sum_{k=1}^{\infty} \int_{\mathbf{R}^1} g_k$ converges. Therefore, by the Levi theorem (Theorem 10.25), the series 
$\sum_{k=1}^{\infty} g_k$ converges almost everywhere on $\mathbf{R}^1$. In other words, there is a subset 
$T$ of $\mathbf{R}^1$ of 1-measure 0 such that the series 

$$\sum_{k=1}^{\infty} \mu(X_k)\chi_{Y_k}(y) \text{ converges for all } y \text{ in } \mathbf{R}^1 - T. \qquad (7)$$

Take a point $y$ in $\mathbf{R}^1 - T$, keep $y$ fixed and consider the set $S^y$. We will prove that 
$S^y$ has 1-measure zero. 

We can assume that $S^y$ is nonempty; otherwise the result is trivial. Let 

$$A(y) = \{X_k : y \in Y_k,\ k = 1, 2, \ldots\}.$$

Then $A(y)$ is a countable collection of one-dimensional intervals which we relabel 
as $\{J_1, J_2, \ldots\}$. The sum of the lengths of all the intervals $J_k$ converges because of 
(7). If $x \in S^y$, then $(x, y) \in S$ so $(x, y) \in I_k = X_k \times Y_k$ for infinitely many $k$, and 
hence $x \in J_k$ for infinitely many $k$. By the one-dimensional version of Theorem 
15.3 it follows that $S^y$ has 1-measure zero. This shows that $S^y$ has 1-measure zero 
for almost all $y$ in $\mathbf{R}^1$, and a similar argument proves that $S_x$ has 1-measure zero 
for almost all $x$ in $\mathbf{R}^1$. 


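15.7 FUBINI’S REDUCTION THEOREM FOR DOUBLE INTEGRALS 

Theorem 15.6. Assume $f$ is Lebesgue-integrable on $\mathbf{R}^2$. Then we have: 

a) There is a set $T$ of 1-measure 0 such that the Lebesgue integral $\int_{\mathbf{R}^1} f(x, y)\, dx$ 
exists for all $y$ in $\mathbf{R}^1 - T$. 

b) The function $G$ defined on $\mathbf{R}^1$ by the equation 

$$G(y) = \begin{cases} \int_{\mathbf{R}^1} f(x, y)\, dx & \text{if } y \in \mathbf{R}^1 - T, \\ 0 & \text{if } y \in T, \end{cases}$$

is Lebesgue-integrable on $\mathbf{R}^1$. 

c) $\iint_{\mathbf{R}^2} f = \int_{\mathbf{R}^1} G(y)\, dy$. That is, 

$$\iint_{\mathbf{R}^2} f(x, y)\, d(x, y) = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f(x, y)\, dx \Bigr] dy.$$

NOTE. There is a corresponding result which concludes that 

$$\iint_{\mathbf{R}^2} f(x, y)\, d(x, y) = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f(x, y)\, dy \Bigr] dx.$$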


Proof. We have already proved the theorem for step functions. We prove it next 
for upper functions. If $f \in U(\mathbf{R}^2)$ there is an increasing sequence of step functions 
$\{s_n\}$ such that $s_n(x, y) \to f(x, y)$ for all $(x, y)$ in $\mathbf{R}^2 - S$, where $S$ is a set of 
2-measure 0; also, 

$$\lim_{n \to \infty} \iint_{\mathbf{R}^2} s_n(x, y)\, d(x, y) = \iint_{\mathbf{R}^2} f(x, y)\, d(x, y).$$

Now $(x, y) \in \mathbf{R}^2 - S$ if, and only if, $x \in \mathbf{R}^1 - S^y$. Hence 

$$s_n(x, y) \to f(x, y) \quad \text{if } x \in \mathbf{R}^1 - S^y. \qquad (8)$$

Let $t_n(y) = \int_{\mathbf{R}^1} s_n(x, y)\, dx$. This integral exists for each real $y$ and is an integrable 
function of $y$. Moreover, by Theorem 15.2 we have 

$$\int_{\mathbf{R}^1} t_n(y)\, dy = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} s_n(x, y)\, dx \Bigr] dy = \iint_{\mathbf{R}^2} s_n(x, y)\, d(x, y) \le \iint_{\mathbf{R}^2} f(x, y)\, d(x, y).$$

Since the sequence $\{t_n\}$ is increasing, the last inequality shows that $\lim_{n \to \infty} \int_{\mathbf{R}^1} t_n(y)\, dy$ 
exists. Therefore, by the Levi theorem (Theorem 10.24) there is a function $t$ in 
$L(\mathbf{R}^1)$ such that $t_n \to t$ almost everywhere on $\mathbf{R}^1$. In other words, there is a set 
$T_1$ of 1-measure 0 such that $t_n(y) \to t(y)$ if $y \in \mathbf{R}^1 - T_1$. Moreover, 

$$\int_{\mathbf{R}^1} t(y)\, dy = \lim_{n \to \infty} \int_{\mathbf{R}^1} t_n(y)\, dy.$$

Again, since $\{t_n\}$ is increasing, we have 

$$t_n(y) = \int_{\mathbf{R}^1} s_n(x, y)\, dx \le t(y) \quad \text{if } y \in \mathbf{R}^1 - T_1.$$

Applying the Levi theorem to $\{s_n\}$ we find that if $y \in \mathbf{R}^1 - T_1$ there is a function 
$g$ in $L(\mathbf{R}^1)$ such that $s_n(x, y) \to g(x, y)$ for $x$ in $\mathbf{R}^1 - A$, where $A$ is a set of 
1-measure 0. (The set $A$ depends on $y$.) Comparing this with (8) we see that if 
$y \in \mathbf{R}^1 - T_1$ then 

$$g(x, y) = f(x, y) \quad \text{if } x \in \mathbf{R}^1 - (A \cup S^y). \qquad (9)$$

But $A$ has 1-measure 0 and $S^y$ has 1-measure 0 for almost all $y$, say for all $y$ in 
$\mathbf{R}^1 - T_2$, where $T_2$ has 1-measure 0. Let $T = T_1 \cup T_2$. Then $T$ has 1-measure 0. 
If $y \in \mathbf{R}^1 - T$, the set $A \cup S^y$ has 1-measure 0 and (9) holds. Since the integral 
$\int_{\mathbf{R}^1} g(x, y)\, dx$ exists if $y \in \mathbf{R}^1 - T$ it follows that the integral $\int_{\mathbf{R}^1} f(x, y)\, dx$ also 
exists if $y \in \mathbf{R}^1 - T$. This proves (a). Also, if $y \in \mathbf{R}^1 - T$ we have 

$$\int_{\mathbf{R}^1} f(x, y)\, dx = \int_{\mathbf{R}^1} g(x, y)\, dx = \lim_{n \to \infty} \int_{\mathbf{R}^1} s_n(x, y)\, dx = t(y). \qquad (10)$$

Since $t \in L(\mathbf{R}^1)$, this proves (b). Finally, we have 

$$\int_{\mathbf{R}^1} t(y)\, dy = \lim_{n \to \infty} \int_{\mathbf{R}^1} t_n(y)\, dy = \lim_{n \to \infty} \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} s_n(x, y)\, dx \Bigr] dy$$

$$= \lim_{n \to \infty} \iint_{\mathbf{R}^2} s_n(x, y)\, d(x, y) = \iint_{\mathbf{R}^2} f(x, y)\, d(x, y).$$

Comparing this with (10) we obtain (c). This proves Fubini’s theorem for upper 
functions. 

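To prove it for Lebesgue-integrable functions we write $f = u - v$, where 
$u \in U(\mathbf{R}^2)$ and $v \in U(\mathbf{R}^2)$ and we obtain 

$$\iint_{\mathbf{R}^2} f = \iint_{\mathbf{R}^2} u - \iint_{\mathbf{R}^2} v = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} u(x, y)\, dx \Bigr] dy - \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} v(x, y)\, dx \Bigr] dy$$

$$= \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} \{u(x, y) - v(x, y)\}\, dx \Bigr] dy = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f(x, y)\, dx \Bigr] dy.$$

As an immediate corollary of Theorem 15.6 and the two-dimensional analog 
of Theorem 10.11 we obtain: 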


Theorem 15.7. Assume that $f$ is defined and bounded on a compact rectangle 
$I = [a, b] \times [c, d]$, and that $f$ is continuous almost everywhere on $I$. Then $f \in L(I)$ 
and we have 

$$\iint_I f(x, y)\, d(x, y) = \int_c^d \Bigl[ \int_a^b f(x, y)\, dx \Bigr] dy = \int_a^b \Bigl[ \int_c^d f(x, y)\, dy \Bigr] dx.$$

NOTE. The one-dimensional integral $\int_a^b f(x, y)\, dx$ exists for almost all $y$ in $[c, d]$ 
as a Lebesgue integral. It need not exist as a Riemann integral. A similar remark 
applies to the integral $\int_c^d f(x, y)\, dy$. In the Riemann theory, the inner integrals 
in the reduction formula must be replaced by upper or lower integrals. (See 
Theorem 14.6, part (v).) 


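There is, of course, an extension of Fubini’s theorem to higher-dimensional 
integrals. If $f$ is Lebesgue-integrable on $\mathbf{R}^{m+k}$ the analog of Theorem 15.6 
concludes that 

$$\int_{\mathbf{R}^{m+k}} f = \int_{\mathbf{R}^k} \Bigl[ \int_{\mathbf{R}^m} f(\mathbf{x}; \mathbf{y})\, d\mathbf{x} \Bigr] d\mathbf{y} = \int_{\mathbf{R}^m} \Bigl[ \int_{\mathbf{R}^k} f(\mathbf{x}; \mathbf{y})\, d\mathbf{y} \Bigr] d\mathbf{x}.$$

Here we have written a point in $\mathbf{R}^{m+k}$ as $(\mathbf{x}; \mathbf{y})$, where $\mathbf{x} \in \mathbf{R}^m$ and $\mathbf{y} \in \mathbf{R}^k$. This 
can be proved by an extension of the method used to prove the two-dimensional 
case, but we shall omit the details. 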


15.8 THE TONELLI-HOBSON TEST FOR INTEGRABILITY 


Which functions are Lebesgue-integrable on R 2 ? The next theorem gives a useful 
sufficient condition for integrability. Its proof makes use of Fubini’s theorem. 

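Theorem 15.8. Assume that $f$ is measurable on $\mathbf{R}^2$ and assume that at least one of 
the two iterated integrals 

$$\int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} |f(x, y)|\, dx \Bigr] dy \quad \text{or} \quad \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} |f(x, y)|\, dy \Bigr] dx$$

exists. Then we have: 

a) $f \in L(\mathbf{R}^2)$. 

b) $\iint_{\mathbf{R}^2} f = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f(x, y)\, dx \Bigr] dy = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f(x, y)\, dy \Bigr] dx$. 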

Proof. Part (b) follows from part (a) because of Fubini’s theorem. We will also 
use Fubini’s theorem to prove part (a). Assume that the iterated integral 
$\int_{\mathbf{R}^1} \bigl[ \int_{\mathbf{R}^1} |f(x, y)|\, dx \bigr] dy$ exists. Let $\{s_n\}$ denote the increasing sequence of 
nonnegative step functions defined as follows: 

$$s_n(x, y) = \begin{cases} n & \text{if } |x| \le n \text{ and } |y| \le n, \\ 0 & \text{otherwise.} \end{cases}$$

Let $f_n(x, y) = \min\{s_n(x, y), |f(x, y)|\}$. Both $s_n$ and $|f|$ are measurable so $f_n$ is 
measurable. Also, we have $0 \le f_n(x, y) \le s_n(x, y)$, so $f_n$ is dominated by a 
Lebesgue-integrable function. Therefore, $f_n \in L(\mathbf{R}^2)$. Hence we can apply Fubini’s 
theorem to $f_n$ along with the inequality $0 \le f_n(x, y) \le |f(x, y)|$ to obtain 

$$\iint_{\mathbf{R}^2} f_n = \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} f_n(x, y)\, dx \Bigr] dy \le \int_{\mathbf{R}^1} \Bigl[ \int_{\mathbf{R}^1} |f(x, y)|\, dx \Bigr] dy.$$

Since $\{f_n\}$ is increasing, this shows that the limit $\lim_{n \to \infty} \iint_{\mathbf{R}^2} f_n$ exists. By the Levi 
theorem (the two-dimensional analog of Theorem 10.24), $\{f_n\}$ converges almost 
everywhere on $\mathbf{R}^2$ to a limit function in $L(\mathbf{R}^2)$. But $f_n(x, y) \to |f(x, y)|$ as $n \to \infty$, 
so $|f| \in L(\mathbf{R}^2)$. Since $f$ is measurable, it follows that $f \in L(\mathbf{R}^2)$. This proves (a). 
The proof is similar if the other iterated integral exists. 
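
The absolute values in the hypothesis cannot be dropped. The following standard example is added here as an illustration and is not taken from the text above. It gives a measurable function whose two iterated integrals both exist but differ, so the function is not in $L(\mathbf{R}^2)$, and by Theorem 15.8 neither iterated integral of $|f|$ can exist.

```latex
% Assumed example: f is defined on the open unit square and is 0 elsewhere.
\[
  f(x, y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}
  \quad \text{if } 0 < x < 1,\ 0 < y < 1,
  \qquad f(x, y) = 0 \text{ otherwise.}
\]
% For fixed x in (0, 1), note that
%   d/dy [ y / (x^2 + y^2) ] = (x^2 - y^2) / (x^2 + y^2)^2 .
\[
  \int_0^1 f(x, y)\, dy = \left[ \frac{y}{x^2 + y^2} \right]_{y=0}^{y=1}
  = \frac{1}{1 + x^2},
  \qquad
  \int_0^1 \left[ \int_0^1 f(x, y)\, dy \right] dx
  = \int_0^1 \frac{dx}{1 + x^2} = \frac{\pi}{4}.
\]
% Since f(y, x) = -f(x, y), the other order gives the opposite sign.
\[
  \int_0^1 \left[ \int_0^1 f(x, y)\, dx \right] dy = -\frac{\pi}{4}.
\]
```

If $f$ were Lebesgue-integrable on $\mathbf{R}^2$, Theorem 15.6 and its companion would force the two iterated integrals to agree, which they do not.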


15.9 COORDINATE TRANSFORMATIONS 


One of the most important results in the theory of multiple integration is the 
formula for making a change of variables. This is an extension of the formula 

$$\int_{g(c)}^{g(d)} f(x)\, dx = \int_c^d f[g(t)]\, g'(t)\, dt,$$

which was proved in Theorem 7.36 for Riemann integrals under the assumption 
that $g$ has a continuous derivative $g'$ on an interval $T = [c, d]$ and that $f$ is 
continuous on the image $g(T)$. 

Consider the special case in which $g'$ is never zero (hence of constant sign) on 
$T$. If $g'$ is positive on $T$, then $g$ is increasing, so $g(c) < g(d)$, $g(T) = [g(c), g(d)]$, 
and the above formula can be written as follows: 

$$\int_{g(T)} f(x)\, dx = \int_T f[g(t)]\, g'(t)\, dt.$$

On the other hand, if $g'$ is negative on $T$, then $g(T) = [g(d), g(c)]$ and the above 
formula becomes 

$$\int_{g(T)} f(x)\, dx = -\int_T f[g(t)]\, g'(t)\, dt.$$

Both cases are included, therefore, in the single formula 

$$\int_{g(T)} f(x)\, dx = \int_T f[g(t)]\, |g'(t)|\, dt. \qquad (11)$$
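
Formula (11) is easy to test numerically in the case where $g'$ is negative. The sketch below is an illustration under assumed data, not part of the original text: it takes $f(x) = x^2$, $g(t) = 1/t$ and $T = [1, 2]$, so that $g(T) = [1/2, 1]$, and it uses a simple midpoint rule. The helper name `midpoint` is hypothetical.

```python
def midpoint(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

def f(x):
    return x * x

def g(t):
    return 1.0 / t          # decreasing on T = [1, 2]

def dg(t):
    return -1.0 / (t * t)   # g'(t) < 0 on T

# Left side of (11): integral of f over g(T) = [1/2, 1].
lhs = midpoint(f, 0.5, 1.0)

# Right side of (11): integral over T of f(g(t)) |g'(t)|.
rhs = midpoint(lambda t: f(g(t)) * abs(dg(t)), 1.0, 2.0)

# Without the absolute value the sign is reversed.
signed = midpoint(lambda t: f(g(t)) * dg(t), 1.0, 2.0)

print(lhs, rhs, signed)
```

Both `lhs` and `rhs` approximate $7/24$, while `signed` approximates $-7/24$, which is the case of the formula with the minus sign.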

Equation (11) is also valid when $c > d$, and it is in this form that the result will be 
generalized to multiple integrals. The function $g$ which transforms the variables 
must be replaced by a vector-valued function called a coordinate transformation 
which is defined as follows. 

Definition 15.9. Let $T$ be an open subset of $\mathbf{R}^n$. A vector-valued function $\mathbf{g} : T \to \mathbf{R}^n$ 
is called a coordinate transformation on $T$ if it has the following three properties: 

a) $\mathbf{g} \in C'$ on $T$. 

b) $\mathbf{g}$ is one-to-one on $T$. 

c) The Jacobian determinant $J_{\mathbf{g}}(\mathbf{t}) = \det D\mathbf{g}(\mathbf{t}) \ne 0$ for all $\mathbf{t}$ in $T$. 

note. A coordinate transformation is sometimes called a diffeomorphism. 

Property (a) states that $\mathbf{g}$ is continuously differentiable on $T$. From Theorem 
13.4 we know that a continuously differentiable function is locally one-to-one 
near each point where its Jacobian determinant does not vanish. Property (b) 
assumes that $\mathbf{g}$ is globally one-to-one on $T$. This guarantees the existence of a 
global inverse $\mathbf{g}^{-1}$ which is defined and one-to-one on the image $\mathbf{g}(T)$. Properties 
(a) and (c) together imply that $\mathbf{g}$ is an open mapping (by Theorem 13.5). Also, $\mathbf{g}^{-1}$ 
is continuously differentiable on $\mathbf{g}(T)$ (by Theorem 13.6). 

Further properties of coordinate transformations will be deduced from the 
following multiplicative property of Jacobian determinants. 

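Theorem 15.10 (Multiplication theorem for Jacobian determinants). Assume that $\mathbf{g}$ 
is differentiable on an open set $T$ in $\mathbf{R}^n$ and that $\mathbf{h}$ is differentiable on the image $\mathbf{g}(T)$. 
Then the composition $\mathbf{k} = \mathbf{h} \circ \mathbf{g}$ is differentiable on $T$, and for every $\mathbf{t}$ in $T$ we have 

$$J_{\mathbf{k}}(\mathbf{t}) = J_{\mathbf{h}}[\mathbf{g}(\mathbf{t})]\, J_{\mathbf{g}}(\mathbf{t}). \qquad (12)$$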

Proof. The chain rule (Theorem 12.7) tells us that the composition k is differen- 
tiable on T, and the matrix form of the chain rule tells us that the corresponding 
Jacobian matrices are related as follows : 

Dk(t) = Dh[g(t)]Dg(t). (13) 

From the theory of determinants we know that $\det(AB) = \det A \det B$, so (13) 
implies (12). 

This theorem shows that if $\mathbf{g}$ is a coordinate transformation on $T$ and if $\mathbf{h}$ is a 
coordinate transformation on $\mathbf{g}(T)$, then the composition $\mathbf{k}$ is a coordinate 
transformation on $T$. Also, if $\mathbf{h} = \mathbf{g}^{-1}$, then 

$$\mathbf{k}(\mathbf{t}) = \mathbf{t} \text{ for all } \mathbf{t} \text{ in } T, \text{ and } J_{\mathbf{k}}(\mathbf{t}) = 1,$$

so $J_{\mathbf{h}}[\mathbf{g}(\mathbf{t})]\, J_{\mathbf{g}}(\mathbf{t}) = 1$ and $\mathbf{g}^{-1}$ is a coordinate transformation on $\mathbf{g}(T)$. 

A coordinate transformation $\mathbf{g}$ and its inverse $\mathbf{g}^{-1}$ set up a one-to-one 
correspondence between the open subsets of $T$ and the open subsets of $\mathbf{g}(T)$, and also 
between the compact subsets of $T$ and the compact subsets of $\mathbf{g}(T)$. The following 
examples are commonly used coordinate transformations. 

Example 1. Polar coordinates in $\mathbf{R}^2$. In this case we take 

$$T = \{(t_1, t_2) : t_1 > 0,\ 0 < t_2 < 2\pi\},$$

and we let $\mathbf{g} = (g_1, g_2)$ be the function defined on $T$ as follows: 

$$g_1(\mathbf{t}) = t_1 \cos t_2, \qquad g_2(\mathbf{t}) = t_1 \sin t_2.$$

It is customary to denote the components of $\mathbf{t}$ by $(r, \theta)$ rather than $(t_1, t_2)$. The 
coordinate transformation $\mathbf{g}$ maps each point $(r, \theta)$ in $T$ onto the point $(x, y)$ in $\mathbf{g}(T)$ given 
by the familiar formulas 

$$x = r \cos \theta, \qquad y = r \sin \theta.$$

The image $\mathbf{g}(T)$ is the set $\mathbf{R}^2 - \{(x, 0) : x \ge 0\}$, and the Jacobian determinant is 

$$J_{\mathbf{g}}(\mathbf{t}) = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r.$$

Example 2. Cylindrical coordinates in $\mathbf{R}^3$. Here we write $\mathbf{t} = (r, \theta, z)$ and we take 

$$T = \{(r, \theta, z) : r > 0,\ 0 < \theta < 2\pi,\ -\infty < z < +\infty\}.$$

The coordinate transformation $\mathbf{g}$ maps each point $(r, \theta, z)$ in $T$ onto the point $(x, y, z)$ 
in $\mathbf{g}(T)$ given by the equations 

$$x = r \cos\theta, \qquad y = r \sin\theta, \qquad z = z.$$

The image $\mathbf{g}(T)$ is the set $\mathbf{R}^3 - \{(x, 0, z) : x \ge 0\}$, and the Jacobian determinant is 
given by 

$$J_{\mathbf{g}}(\mathbf{t}) = \begin{vmatrix} \cos\theta & \sin\theta & 0 \\ -r\sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} = r.$$

The geometric significance of $r$, $\theta$, and $z$ is shown in Fig. 15.3. 


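Example 3. Spherical coordinates in $\mathbf{R}^3$. In this case we write $\mathbf{t} = (\rho, \theta, \varphi)$ and we take 

$$T = \{(\rho, \theta, \varphi) : \rho > 0,\ 0 < \theta < 2\pi,\ 0 < \varphi < \pi\}.$$

The coordinate transformation $\mathbf{g}$ maps each point $(\rho, \theta, \varphi)$ in $T$ onto the point $(x, y, z)$ 
in $\mathbf{g}(T)$ given by the equations 

$$x = \rho \cos\theta \sin\varphi, \qquad y = \rho \sin\theta \sin\varphi, \qquad z = \rho \cos\varphi.$$

The image $\mathbf{g}(T)$ is the set $\mathbf{R}^3 - \{(x, 0, z) : x \ge 0\}$, and the 
Jacobian determinant is 

$$J_{\mathbf{g}}(\mathbf{t}) = \begin{vmatrix} \cos\theta\sin\varphi & \sin\theta\sin\varphi & \cos\varphi \\ -\rho\sin\theta\sin\varphi & \rho\cos\theta\sin\varphi & 0 \\ \rho\cos\theta\cos\varphi & \rho\sin\theta\cos\varphi & -\rho\sin\varphi \end{vmatrix} = -\rho^2 \sin\varphi.$$

The geometric significance of $\rho$, $\theta$, and $\varphi$ is shown in Fig. 15.4. 

Figure 15.4 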
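
The Jacobian determinants in Examples 2 and 3 can be checked numerically. The sketch below is an illustration only, not part of the original text: it approximates each partial derivative by a central difference and evaluates the resulting 3 x 3 determinant at one sample point. The names `jacobian_det` and `det3` are hypothetical helpers.

```python
from math import sin, cos

def spherical(rho, theta, phi):
    return (rho * cos(theta) * sin(phi),
            rho * sin(theta) * sin(phi),
            rho * cos(phi))

def cylindrical(r, theta, z):
    return (r * cos(theta), r * sin(theta), z)

def det3(a):
    # Cofactor expansion of a 3 x 3 determinant along the first row.
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def jacobian_det(func, t, h=1e-6):
    """Approximate det(D_j g_i(t)) for a map from R^3 to R^3."""
    columns = []
    for j in range(3):
        plus, minus = list(t), list(t)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(*plus), func(*minus)
        # Central difference for the partial derivatives D_j g_i, i = 0, 1, 2.
        columns.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    matrix = [[columns[j][i] for j in range(3)] for i in range(3)]
    return det3(matrix)

rho, theta, phi = 2.0, 0.7, 1.1
print(jacobian_det(spherical, (rho, theta, phi)), -rho ** 2 * sin(phi))
print(jacobian_det(cylindrical, (1.5, 0.4, -2.0)), 1.5)
```

The negative sign for spherical coordinates comes from the ordering $(\rho, \theta, \varphi)$ of the variables; only the absolute value $\rho^2 \sin\varphi$ enters the transformation formula.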


Example 4. Linear transformations in $\mathbf{R}^n$. Let $\mathbf{g} : \mathbf{R}^n \to \mathbf{R}^n$ be a linear transformation 
represented by a matrix $(a_{ij}) = m(\mathbf{g})$, so that 

$$\mathbf{g}(\mathbf{t}) = \Bigl( \sum_{j=1}^n a_{1j} t_j, \ldots, \sum_{j=1}^n a_{nj} t_j \Bigr).$$

Then $\mathbf{g} = (g_1, \ldots, g_n)$ where $g_i(\mathbf{t}) = \sum_{j=1}^n a_{ij} t_j$, and the Jacobian matrix is 

$$D\mathbf{g}(\mathbf{t}) = (D_j g_i(\mathbf{t})) = (a_{ij}).$$

Thus the Jacobian determinant $J_{\mathbf{g}}(\mathbf{t})$ is constant, and equals $\det(a_{ij})$, the determinant of 
the matrix $(a_{ij})$. We also call this the determinant of $\mathbf{g}$ and we write 

$$\det \mathbf{g} = \det(a_{ij}).$$

A linear transformation $\mathbf{g}$ which is one-to-one on $\mathbf{R}^n$ is called nonsingular. 
We shall use the following elementary facts concerning nonsingular 
transformations from $\mathbf{R}^n$ to $\mathbf{R}^n$. (Proofs can be found in any text on linear algebra; see also 
Reference 14.1.) 

A linear transformation $\mathbf{g}$ is nonsingular if, and only if, its matrix $A = m(\mathbf{g})$ 
has an inverse $A^{-1}$ such that $AA^{-1} = I$, where $I$ is the identity matrix (the matrix 
of the identity transformation), in which case $A$ is also called nonsingular. An 
$n \times n$ matrix $A$ is nonsingular if, and only if, $\det A \ne 0$. Thus, a linear function 
$\mathbf{g}$ is a coordinate transformation if, and only if, $\det \mathbf{g} \ne 0$. 

Every nonsingular $\mathbf{g}$ can be expressed as a composition of three special types 
of nonsingular transformations called elementary transformations, which we refer 
to as types a, b, and c. They are defined as follows: 

Type a: $\mathbf{g}_a(t_1, \ldots, t_k, \ldots, t_n) = (t_1, \ldots, \lambda t_k, \ldots, t_n)$, where $\lambda \ne 0$. In other 
words, $\mathbf{g}_a$ multiplies one component of $\mathbf{t}$ by a nonzero scalar $\lambda$. In particular, $\mathbf{g}_a$ 
maps the unit coordinate vectors as follows: 

$$\mathbf{g}_a(\mathbf{u}_k) = \lambda \mathbf{u}_k \text{ for some } k, \qquad \mathbf{g}_a(\mathbf{u}_i) = \mathbf{u}_i \text{ for all } i \ne k.$$

The matrix of $\mathbf{g}_a$ can be obtained by multiplying the entries in the kth row of the 
identity matrix by $\lambda$. Also, $\det \mathbf{g}_a = \lambda$. 

Type b: $\mathbf{g}_b(t_1, \ldots, t_k, \ldots, t_n) = (t_1, \ldots, t_k + t_j, \ldots, t_n)$, where $j \ne k$. Thus, 
$\mathbf{g}_b$ replaces one component of $\mathbf{t}$ by itself plus another. In particular, $\mathbf{g}_b$ maps the 
coordinate vectors as follows: 

$$\mathbf{g}_b(\mathbf{u}_k) = \mathbf{u}_k + \mathbf{u}_j \text{ for some fixed } k \text{ and } j,\ k \ne j,$$
$$\mathbf{g}_b(\mathbf{u}_i) = \mathbf{u}_i \text{ for all } i \ne k.$$

The matrix of $\mathbf{g}_b$ can be obtained from the identity matrix by replacing the kth row 
of $I$ by the kth row of $I$ plus the jth row of $I$. Also, $\det \mathbf{g}_b = 1$. 

Type c: $\mathbf{g}_c(t_1, \ldots, t_i, \ldots, t_j, \ldots, t_n) = (t_1, \ldots, t_j, \ldots, t_i, \ldots, t_n)$, where $i \ne j$. 
That is, $\mathbf{g}_c$ interchanges the ith and jth components of $\mathbf{t}$ for some $i$ and $j$ with 
$i \ne j$. In particular, $\mathbf{g}_c(\mathbf{u}_i) = \mathbf{u}_j$, $\mathbf{g}_c(\mathbf{u}_j) = \mathbf{u}_i$, and $\mathbf{g}_c(\mathbf{u}_k) = \mathbf{u}_k$ for all $k \ne i$, $k \ne j$. 
The matrix of $\mathbf{g}_c$ is the identity matrix with the ith and jth rows interchanged. In 
this case $\det \mathbf{g}_c = -1$. 

The inverse of an elementary transformation is another of the same type. The 
matrix of an elementary transformation is called an elementary matrix. Every 
nonsingular matrix $A$ can be transformed to the identity matrix $I$ by multiplying 
$A$ on the left by a succession of elementary matrices. (This is the familiar 
Gauss-Jordan process of linear algebra.) Thus, 

$$I = T_1 T_2 \cdots T_r A,$$

where each $T_k$ is an elementary matrix. Hence, 

$$A = T_r^{-1} \cdots T_2^{-1} T_1^{-1}.$$

If $A = m(\mathbf{g})$, this gives a corresponding factorization of $\mathbf{g}$ as a composition of 
elementary transformations. 
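
The determinant values quoted for the three types can be checked on small matrices. The sketch below is an illustration only, not part of the original text, and it does not carry out the Gauss-Jordan factorization itself. It builds the elementary matrices exactly as described above (row operations on the identity matrix) and confirms that the determinant of a product is the product of the determinants. The function names are hypothetical, and indices are 0-based.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def type_a(n, k, lam):
    # Multiply row k of the identity matrix by the nonzero scalar lam.
    m = identity(n)
    m[k][k] = Fraction(lam)
    return m

def type_b(n, k, j):
    # Replace row k of the identity matrix by row k plus row j (j != k).
    assert j != k
    m = identity(n)
    m[k][j] = Fraction(1)
    return m

def type_c(n, i, j):
    # Interchange rows i and j of the identity matrix.
    m = identity(n)
    m[i], m[j] = m[j], m[i]
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    # Cofactor expansion along the first row; fine for small matrices.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# A composition of one transformation of each type.
A = matmul(matmul(type_a(3, 0, 3), type_b(3, 1, 2)), type_c(3, 0, 2))
print(det(type_a(3, 0, 3)), det(type_b(3, 1, 2)), det(type_c(3, 0, 2)), det(A))
```

Since each factor has determinant $\lambda$, $1$, or $-1$, the determinant of any such product is nonzero, which matches the statement that compositions of elementary transformations are nonsingular.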


15.10 THE TRANSFORMATION FORMULA FOR MULTIPLE INTEGRALS 

The rest of this chapter is devoted to a proof of the following transformation 
formula for multiple integrals. 

Theorem 15.11. Let $T$ be an open subset of $\mathbf{R}^n$ and let $\mathbf{g}$ be a coordinate 
transformation on $T$. Let $f$ be a real-valued function defined on the image $\mathbf{g}(T)$ and assume 
that the Lebesgue integral $\int_{\mathbf{g}(T)} f(\mathbf{x})\, d\mathbf{x}$ exists. Then the Lebesgue integral 
$\int_T f[\mathbf{g}(\mathbf{t})]\, |J_{\mathbf{g}}(\mathbf{t})|\, d\mathbf{t}$ also exists and we have 

$$\int_{\mathbf{g}(T)} f(\mathbf{x})\, d\mathbf{x} = \int_T f[\mathbf{g}(\mathbf{t})]\, |J_{\mathbf{g}}(\mathbf{t})|\, d\mathbf{t}. \qquad (14)$$

The proof of Theorem 15.11 is divided into three parts. Part 1 shows that the 
formula holds for every linear coordinate transformation $\boldsymbol{\alpha}$. As a corollary we 
obtain the relation 

$$\mu[\boldsymbol{\alpha}(A)] = |\det \boldsymbol{\alpha}|\, \mu(A),$$

for every subset $A$ of $\mathbf{R}^n$ with finite Lebesgue measure. In part 2 we consider a 
general coordinate transformation $\mathbf{g}$ and show that (14) holds when $f$ is the 
characteristic function of a compact cube. This gives us 

$$\mu(K) = \int_{\mathbf{g}^{-1}(K)} |J_{\mathbf{g}}(\mathbf{t})|\, d\mathbf{t}, \qquad (15)$$

for every compact cube $K$ in $\mathbf{g}(T)$. This is the lengthiest part of the proof. In part 
3 we use Equation (15) to deduce (14) in its general form. 


15.11 PROOF OF THE TRANSFORMATION FORMULA FOR LINEAR 
COORDINATE TRANSFORMATIONS 

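Theorem 15.12. Let $\boldsymbol{\alpha} : \mathbf{R}^n \to \mathbf{R}^n$ be a linear coordinate transformation. If the 
Lebesgue integral $\int_{\mathbf{R}^n} f(\mathbf{x})\, d\mathbf{x}$ exists, then the Lebesgue integral $\int_{\mathbf{R}^n} f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t}$ 
also exists, and the two integrals are equal. 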


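Proof. First we note that if the theorem is true for $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, then it is also true for 
the composition $\boldsymbol{\gamma} = \boldsymbol{\alpha} \circ \boldsymbol{\beta}$ because 

$$\int_{\mathbf{R}^n} f(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{R}^n} f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t} = \int_{\mathbf{R}^n} f(\boldsymbol{\alpha}[\boldsymbol{\beta}(\mathbf{t})])\, |J_{\boldsymbol{\alpha}}[\boldsymbol{\beta}(\mathbf{t})]|\, |J_{\boldsymbol{\beta}}(\mathbf{t})|\, d\mathbf{t} = \int_{\mathbf{R}^n} f[\boldsymbol{\gamma}(\mathbf{t})]\, |J_{\boldsymbol{\gamma}}(\mathbf{t})|\, d\mathbf{t},$$

since $J_{\boldsymbol{\gamma}}(\mathbf{t}) = J_{\boldsymbol{\alpha}}[\boldsymbol{\beta}(\mathbf{t})]\, J_{\boldsymbol{\beta}}(\mathbf{t})$. 

Therefore, since every nonsingular linear transformation $\boldsymbol{\alpha}$ is a composition 
of elementary transformations, it suffices to prove the theorem for every 
elementary transformation. It also suffices to assume $f \ge 0$. 

Suppose $\boldsymbol{\alpha}$ is of type a. For simplicity, assume that $\boldsymbol{\alpha}$ multiplies the last 
component of $\mathbf{t}$ by a nonzero scalar $\lambda$, say 

$$\boldsymbol{\alpha}(t_1, \ldots, t_n) = (t_1, \ldots, t_{n-1}, \lambda t_n).$$

Then $|J_{\boldsymbol{\alpha}}(\mathbf{t})| = |\det \boldsymbol{\alpha}| = |\lambda|$. We apply Fubini’s theorem to write the integral of $f$ 
over $\mathbf{R}^n$ as the iteration of an (n - 1)-dimensional integral over $\mathbf{R}^{n-1}$ and a 
one-dimensional integral over $\mathbf{R}^1$. For the integral over $\mathbf{R}^1$ we use Theorem 10.17(b) 
and (c), and we obtain 

$$\int_{\mathbf{R}^n} f(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{R}^{n-1}} \Bigl[ \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\, dx_n \Bigr] dx_1 \cdots dx_{n-1}$$

$$= \int_{\mathbf{R}^{n-1}} \Bigl[ |\lambda| \int_{-\infty}^{\infty} f(x_1, \ldots, x_{n-1}, \lambda t_n)\, dt_n \Bigr] dx_1 \cdots dx_{n-1}$$

$$= \int_{\mathbf{R}^{n-1}} \Bigl[ \int_{-\infty}^{\infty} f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, dt_n \Bigr] dt_1 \cdots dt_{n-1} = \int_{\mathbf{R}^n} f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t},$$

where in the last step we use the Tonelli-Hobson theorem. This proves the theorem 
if $\boldsymbol{\alpha}$ is of type a. If $\boldsymbol{\alpha}$ is of type b, the proof is similar except that we use Theorem 
10.17(a) in the one-dimensional integral. In this case $|J_{\boldsymbol{\alpha}}(\mathbf{t})| = 1$. Finally, if $\boldsymbol{\alpha}$ is 
of type c we simply use Fubini’s theorem to interchange the order of integration 
over the ith and jth coordinates. Again, $|J_{\boldsymbol{\alpha}}(\mathbf{t})| = 1$ in this case. 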

As an immediate corollary we have: 


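Theorem 15.13. If $\boldsymbol{\alpha} : \mathbf{R}^n \to \mathbf{R}^n$ is a linear coordinate transformation and if $A$ is 
any subset of $\mathbf{R}^n$ such that the Lebesgue integral $\int_{\boldsymbol{\alpha}(A)} f(\mathbf{x})\, d\mathbf{x}$ exists, then the 
Lebesgue integral $\int_A f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t}$ also exists, and the two are equal. 

Proof. Let $\tilde{f}(\mathbf{x}) = f(\mathbf{x})$ if $\mathbf{x} \in \boldsymbol{\alpha}(A)$, and let $\tilde{f}(\mathbf{x}) = 0$ otherwise. Then 

$$\int_{\boldsymbol{\alpha}(A)} f(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{R}^n} \tilde{f}(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{R}^n} \tilde{f}[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t} = \int_A f[\boldsymbol{\alpha}(\mathbf{t})]\, |J_{\boldsymbol{\alpha}}(\mathbf{t})|\, d\mathbf{t}.$$

As a corollary of Theorem 15.13 we have the following relation between the 
measure of $A$ and the measure of $\boldsymbol{\alpha}(A)$. 

Theorem 15.14. Let $\boldsymbol{\alpha} : \mathbf{R}^n \to \mathbf{R}^n$ be a linear coordinate transformation. If $A$ is a 
subset of $\mathbf{R}^n$ with finite Lebesgue measure $\mu(A)$, then $\boldsymbol{\alpha}(A)$ also has finite Lebesgue 
measure and 

$$\mu[\boldsymbol{\alpha}(A)] = |\det \boldsymbol{\alpha}|\, \mu(A). \qquad (16)$$

Proof. Write $A = \boldsymbol{\alpha}^{-1}(B)$, where $B = \boldsymbol{\alpha}(A)$. Since $\boldsymbol{\alpha}^{-1}$ is also a coordinate 
transformation, we find 

$$\mu(A) = \int_A d\mathbf{x} = \int_B |\det \boldsymbol{\alpha}^{-1}|\, d\mathbf{t} = |\det \boldsymbol{\alpha}^{-1}|\, \mu(B).$$

This proves (16) since $B = \boldsymbol{\alpha}(A)$ and $\det(\boldsymbol{\alpha}^{-1}) = (\det \boldsymbol{\alpha})^{-1}$. 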
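
Relation (16) can be seen concretely in the plane, where the image of the unit square under a linear map is a parallelogram whose area is given by the shoelace formula. The sketch below is an illustration with made-up matrices, not part of the original text; `apply` and `polygon_area` are hypothetical helper names.

```python
def apply(alpha, p):
    """Apply the 2 x 2 matrix alpha to the point p."""
    (a, b), (c, d) = alpha
    x, y = p
    return (a * x + b * y, c * x + d * y)

def polygon_area(pts):
    """Area of a simple polygon from its vertices, by the shoelace formula."""
    s = 0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2

# Vertices of the unit square, which has measure 1.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

alpha = ((2, 1), (1, 3))     # det alpha = 2*3 - 1*1 = 5
swap = ((0, 1), (1, 0))      # a type c transformation, det = -1

area_alpha = polygon_area([apply(alpha, p) for p in square])
area_swap = polygon_area([apply(swap, p) for p in square])
print(area_alpha, area_swap)
```

The second matrix shows why the absolute value appears in (16): interchanging coordinates has determinant $-1$ but leaves areas unchanged.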

Theorem 15.15. If $A$ is a compact Jordan-measurable subset of $\mathbf{R}^n$, then for any 
linear coordinate transformation $\boldsymbol{\alpha} : \mathbf{R}^n \to \mathbf{R}^n$ the image $\boldsymbol{\alpha}(A)$ is a compact 
Jordan-measurable set and its content is given by 

$$c[\boldsymbol{\alpha}(A)] = |\det \boldsymbol{\alpha}|\, c(A).$$

Proof. The set $\boldsymbol{\alpha}(A)$ is compact because $\boldsymbol{\alpha}$ is continuous on $A$. To prove the 
theorem we argue as in the proof of Theorem 15.14. In this case, however, all the 
integrals exist both as Lebesgue integrals and as Riemann integrals. 


15.12 PROOF OF THE TRANSFORMATION FORMULA FOR THE 
CHARACTERISTIC FUNCTION OF A COMPACT CUBE 

This section contains part 2 of the proof of Theorem 15.11. Throughout the 
section we assume that $\mathbf{g}$ is a coordinate transformation on an open set $T$ in $\mathbf{R}^n$. 
Our purpose is to prove that 

$$\mu(K) = \int_{\mathbf{g}^{-1}(K)} |J_{\mathbf{g}}(\mathbf{t})|\, d\mathbf{t},$$

for every compact cube $K$ in $\mathbf{g}(T)$. The auxiliary results needed to prove this formula 
are labelled as lemmas. 

To help simplify the details, we introduce some convenient notation. Instead 
of the usual Euclidean metric for $\mathbf{R}^n$ we shall use the metric $d$ given by 

$$d(\mathbf{x}, \mathbf{y}) = \max_{1 \le i \le n} |x_i - y_i|.$$

This metric was introduced in Example 9, Section 3.13. In this section only we 
shall write $\|\mathbf{x} - \mathbf{y}\|$ for $d(\mathbf{x}, \mathbf{y})$. 

With this metric, a ball $B(\mathbf{a}; r)$ with center $\mathbf{a}$ and radius $r$ is an n-dimensional 
cube with center $\mathbf{a}$ and edge-length $2r$; that is, $B(\mathbf{a}; r)$ is the cartesian product of 
$n$ one-dimensional intervals, each of length $2r$. The measure of such a cube is 
$(2r)^n$, the product of the edge-lengths. 

If $\boldsymbol{\alpha} : \mathbf{R}^n \to \mathbf{R}^n$ is a linear transformation represented by a matrix $(a_{ij})$, so that 

$$\boldsymbol{\alpha}(\mathbf{x}) = \Bigl( \sum_{j=1}^n a_{1j} x_j, \ldots, \sum_{j=1}^n a_{nj} x_j \Bigr),$$

then 

$$\|\boldsymbol{\alpha}(\mathbf{x})\| = \max_{1 \le i \le n} \Bigl| \sum_{j=1}^n a_{ij} x_j \Bigr| \le \|\mathbf{x}\| \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|. \qquad (17)$$

We also define 

$$\|\boldsymbol{\alpha}\| = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|. \qquad (18)$$

This defines a metric $\|\boldsymbol{\alpha} - \boldsymbol{\beta}\|$ on the space of all linear transformations from 
$\mathbf{R}^n$ to $\mathbf{R}^n$. The first lemma gives some properties of this metric. 

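Lemma 1. Let $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ denote linear transformations from $\mathbf{R}^n$ to $\mathbf{R}^n$. Then we have: 

a) $\|\boldsymbol{\alpha}\| = \|\boldsymbol{\alpha}(\mathbf{x})\|$ for some $\mathbf{x}$ with $\|\mathbf{x}\| = 1$. 

b) $\|\boldsymbol{\alpha}(\mathbf{x})\| \le \|\boldsymbol{\alpha}\|\, \|\mathbf{x}\|$ for all $\mathbf{x}$ in $\mathbf{R}^n$. 

c) $\|\boldsymbol{\alpha} \circ \boldsymbol{\beta}\| \le \|\boldsymbol{\alpha}\|\, \|\boldsymbol{\beta}\|$. 

d) $\|\mathbf{I}\| = 1$, where $\mathbf{I}$ is the identity transformation. 

Proof. Suppose that $\max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$ is attained for $i = p$. Take $x_j = 1$ if 
$a_{pj} \ge 0$ and $x_j = -1$ if $a_{pj} < 0$, for $j = 1, \ldots, n$. Then $\|\mathbf{x}\| = 1$ and the pth 
component of $\boldsymbol{\alpha}(\mathbf{x})$ is $\sum_{j=1}^n |a_{pj}| = \|\boldsymbol{\alpha}\|$. Since (17) gives $\|\boldsymbol{\alpha}(\mathbf{x})\| \le \|\boldsymbol{\alpha}\|$, we have 
$\|\boldsymbol{\alpha}\| = \|\boldsymbol{\alpha}(\mathbf{x})\|$, which proves (a). 

Part (b) follows at once from (17) and (18). To prove (c) we use (b) to write 

$$\|(\boldsymbol{\alpha} \circ \boldsymbol{\beta})(\mathbf{x})\| = \|\boldsymbol{\alpha}(\boldsymbol{\beta}(\mathbf{x}))\| \le \|\boldsymbol{\alpha}\|\, \|\boldsymbol{\beta}(\mathbf{x})\| \le \|\boldsymbol{\alpha}\|\, \|\boldsymbol{\beta}\|\, \|\mathbf{x}\|.$$

Taking $\mathbf{x}$ with $\|\mathbf{x}\| = 1$ so that $\|(\boldsymbol{\alpha} \circ \boldsymbol{\beta})(\mathbf{x})\| = \|\boldsymbol{\alpha} \circ \boldsymbol{\beta}\|$, we obtain (c). 

Finally, if $\mathbf{I}$ is the identity transformation, then each sum $\sum_{j=1}^n |a_{ij}| = 1$ in 
(18) so $\|\mathbf{I}\| = 1$. 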
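
The four properties in Lemma 1 can be checked on concrete matrices. The sketch below is an illustration with made-up integer matrices, not part of the original text. It assumes that the extremal vector in part (a) has components $\pm 1$ matching the signs of the row of largest absolute sum; all function names are hypothetical.

```python
def max_norm(x):
    """||x|| = max |x_i|, the metric used in this section."""
    return max(abs(v) for v in x)

def op_norm(a):
    """||alpha|| = largest absolute row sum, as in (18)."""
    return max(sum(abs(v) for v in row) for row in a)

def apply(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def compose(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def extremal_vector(a):
    # Row p with the largest absolute row sum; x_j = +1 or -1 by sign of a_pj.
    p = max(range(len(a)), key=lambda i: sum(abs(v) for v in a[i]))
    return [1 if v >= 0 else -1 for v in a[p]]

alpha = [[1, -2, 0], [3, 1, -1], [0, 2, 2]]
beta = [[2, 0, 1], [-1, 1, 0], [0, 3, -2]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

x = extremal_vector(alpha)
print(op_norm(alpha), max_norm(apply(alpha, x)))        # part (a)
print(op_norm(compose(alpha, beta)),
      op_norm(alpha) * op_norm(beta))                   # part (c)
print(op_norm(identity))                                # part (d)
```

Part (b) can be tested the same way by comparing `max_norm(apply(alpha, y))` with `op_norm(alpha) * max_norm(y)` for any vector `y`.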

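The coordinate transformation $\mathbf{g}$ is differentiable on $T$, so for each $\mathbf{t}$ in $T$ the 
total derivative $\mathbf{g}'(\mathbf{t})$ is a linear transformation from $\mathbf{R}^n$ to $\mathbf{R}^n$ represented by the 
Jacobian matrix $D\mathbf{g}(\mathbf{t}) = (D_j g_i(\mathbf{t}))$. Therefore, taking $\boldsymbol{\alpha} = \mathbf{g}'(\mathbf{t})$ in (18), we find 

$$\|\mathbf{g}'(\mathbf{t})\| = \max_{1 \le i \le n} \sum_{j=1}^n |D_j g_i(\mathbf{t})|.$$

We note that $\|\mathbf{g}'(\mathbf{t})\|$ is a continuous function of $\mathbf{t}$ since all the partial derivatives 
$D_j g_i$ are continuous on $T$. 

If $Q$ is a compact subset of $T$, each function $D_j g_i$ is bounded on $Q$; hence 
$\|\mathbf{g}'(\mathbf{t})\|$ is also bounded on $Q$, and we define 

$$\lambda_{\mathbf{g}}(Q) = \sup_{\mathbf{t} \in Q} \|\mathbf{g}'(\mathbf{t})\| = \sup_{\mathbf{t} \in Q} \Bigl\{ \max_{1 \le i \le n} \sum_{j=1}^n |D_j g_i(\mathbf{t})| \Bigr\}. \qquad (19)$$

The next lemma states that the image $\mathbf{g}(Q)$ of a cube $Q$ of edge-length $2r$ lies 
in another cube of edge-length $2r\lambda_{\mathbf{g}}(Q)$. 

Lemma 2. Let $Q = \{\mathbf{x} : \|\mathbf{x} - \mathbf{a}\| \le r\}$ be a compact cube of edge-length $2r$ 
lying in $T$. Then for each $\mathbf{x}$ in $Q$ we have 

$$\|\mathbf{g}(\mathbf{x}) - \mathbf{g}(\mathbf{a})\| \le r\lambda_{\mathbf{g}}(Q). \qquad (20)$$

Therefore $\mathbf{g}(Q)$ lies in a cube of edge-length $2r\lambda_{\mathbf{g}}(Q)$. 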

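Proof. By the Mean-Value theorem for real-valued functions we have 

$$g_i(\mathbf{x}) - g_i(\mathbf{a}) = \nabla g_i(\mathbf{z}_i) \cdot (\mathbf{x} - \mathbf{a}) = \sum_{j=1}^n D_j g_i(\mathbf{z}_i)(x_j - a_j),$$

where $\mathbf{z}_i$ lies on the line segment joining $\mathbf{x}$ and $\mathbf{a}$. Therefore 

$$|g_i(\mathbf{x}) - g_i(\mathbf{a})| \le \sum_{j=1}^n |D_j g_i(\mathbf{z}_i)|\, |x_j - a_j| \le \|\mathbf{x} - \mathbf{a}\| \sum_{j=1}^n |D_j g_i(\mathbf{z}_i)| \le r\lambda_{\mathbf{g}}(Q),$$

and this implies (20). 

NOTE. Inequality (20) shows that $\mathbf{g}(Q)$ lies inside a cube of content 

$$(2r\lambda_{\mathbf{g}}(Q))^n = \{\lambda_{\mathbf{g}}(Q)\}^n c(Q).$$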

Lemma 3. If $A$ is any compact Jordan-measurable subset of $T$, then $\mathbf{g}(A)$ is a 
compact Jordan-measurable subset of $\mathbf{g}(T)$. 

Proof. The compactness of $\mathbf{g}(A)$ follows from the continuity of $\mathbf{g}$. Since $A$ is 
Jordan-measurable, its boundary $\partial A$ has content zero. Also, $\partial(\mathbf{g}(A)) = \mathbf{g}(\partial A)$, 
since $\mathbf{g}$ is one-to-one and continuous. Therefore, to complete the proof, it suffices 
to show that $\mathbf{g}(\partial A)$ has content zero. 

Given $\varepsilon > 0$, there is a finite number of open intervals $A_1, \ldots, A_m$ lying in 
$T$, the sum of whose measures is $< \varepsilon$, such that $\partial A \subseteq \bigcup_{i=1}^m A_i$. By Theorem 15.1, 
this union can also be expressed as a union $U(\varepsilon)$ of a countable disjoint collection 
of cubes, the sum of whose measures is $< \varepsilon$. If $\varepsilon < 1$ we can assume that each 
cube in $U(\varepsilon)$ is contained in $U(1)$. (If not, intersect the cubes in $U(\varepsilon)$ with $U(1)$ and 
apply Theorem 15.1 again.) Since $\partial A$ is compact, a finite subcollection of the cubes 
in $U(\varepsilon)$ covers $\partial A$, say $Q_1, \ldots, Q_k$. By Lemma 2, the image $\mathbf{g}(Q_i)$ lies in a cube of 
measure $\{\lambda_{\mathbf{g}}(Q_i)\}^n c(Q_i)$. Let $\lambda = \lambda_{\mathbf{g}}(\overline{U(1)})$. Then $\lambda_{\mathbf{g}}(Q_i) \le \lambda$ since $Q_i \subseteq U(1)$. 
Thus $\mathbf{g}(\partial A)$ is covered by a finite number of cubes, the sum of whose measures 
does not exceed $\lambda^n \sum_{i=1}^k c(Q_i) < \varepsilon\lambda^n$. Since this holds for every $\varepsilon < 1$, it follows 
that $\mathbf{g}(\partial A)$ has Jordan content 0, so $\mathbf{g}(A)$ is Jordan-measurable. 

The next lemma relates the content of a cube $Q$ with that of its image $g(Q)$.

Lemma 4. Let $Q$ be a compact cube in $T$ and let $h = \alpha \circ g$, where $\alpha : \mathbf{R}^n \to \mathbf{R}^n$ is
any nonsingular linear transformation. Then

$$c[g(Q)] \le |\det \alpha|^{-1} \{\lambda_h(Q)\}^n c(Q). \tag{21}$$

Proof. From Lemma 2 we have $c[g(Q)] \le \{\lambda_g(Q)\}^n c(Q)$. Applying this inequality
to the coordinate transformation $h$, we find

$$c[h(Q)] \le \{\lambda_h(Q)\}^n c(Q).$$

But by Theorem 15.15 we have $c[h(Q)] = c[\alpha(g(Q))] = |\det \alpha|\, c[g(Q)]$, so

$$c[g(Q)] = |\det \alpha|^{-1} c[h(Q)] \le |\det \alpha|^{-1} \{\lambda_h(Q)\}^n c(Q).$$
Lemma 5. Let $Q$ be a compact cube in $T$. Then for every $\varepsilon > 0$, there is a $\delta > 0$
such that if $t \in Q$ and $a \in Q$ we have

$$\|g'(a)^{-1} \circ g'(t)\| < 1 + \varepsilon \quad \text{whenever } \|t - a\| < \delta. \tag{22}$$

Proof. The function $\|g'(t)^{-1}\|$ is continuous and hence bounded on $Q$, say
$\|g'(t)^{-1}\| < M$ for all $t$ in $Q$, where $M > 0$. By the continuity of $g'$, which is
uniform on the compact cube $Q$, there is a $\delta > 0$ such that

$$\|g'(t) - g'(a)\| < \frac{\varepsilon}{M} \quad \text{whenever } \|t - a\| < \delta.$$

If $I$ denotes the identity transformation, then

$$g'(a)^{-1} \circ g'(t) - I = g'(a)^{-1} \circ \{g'(t) - g'(a)\},$$

so if $\|t - a\| < \delta$ we have

$$\|g'(a)^{-1} \circ g'(t) - I\| \le \|g'(a)^{-1}\|\, \|g'(t) - g'(a)\| < M \frac{\varepsilon}{M} = \varepsilon.$$

The triangle inequality gives us $\|\alpha\| \le \|\beta\| + \|\alpha - \beta\|$. Taking

$$\alpha = g'(a)^{-1} \circ g'(t) \quad \text{and} \quad \beta = I,$$

we obtain (22).

Lemma 6. Let $Q$ be a compact cube in $T$. Then we have

$$c[g(Q)] \le \int_Q |J_g(t)|\, dt.$$

Proof. The integral on the right exists as a Riemann integral because the
integrand is continuous and bounded on $Q$. Therefore, given $\varepsilon > 0$, there is a partition
$P_\varepsilon$ of $Q$ such that for every Riemann sum $S(P, |J_g|)$ with $P$ finer than $P_\varepsilon$ we have

$$\Bigl| S(P, |J_g|) - \int_Q |J_g(t)|\, dt \Bigr| < \varepsilon.$$

Take such a partition $P$ into a finite number of cubes $Q_1, \ldots, Q_m$, each of which
has edge-length $< \delta$, where $\delta$ is the number (depending on $\varepsilon$) given by Lemma 5.
Let $a_i$ denote the center of $Q_i$ and apply Lemma 4 to $Q_i$ with $\alpha = g'(a_i)^{-1}$ to
obtain the inequality

$$c[g(Q_i)] \le |\det g'(a_i)|\, \{\lambda_h(Q_i)\}^n c(Q_i), \tag{23}$$

where $h = \alpha \circ g$. By the chain rule we have $h'(t) = \alpha'(x) \circ g'(t)$, where $x = g(t)$.
But $\alpha'(x) = \alpha$ since $\alpha$ is a linear function, so

$$h'(t) = \alpha \circ g'(t) = g'(a_i)^{-1} \circ g'(t).$$

But by Lemma 5 we have $\|h'(t)\| < 1 + \varepsilon$ if $t \in Q_i$, so

$$\lambda_h(Q_i) = \sup_{t \in Q_i} \|h'(t)\| \le 1 + \varepsilon.$$

Thus (23) gives us

$$c[g(Q_i)] \le |\det g'(a_i)|\, (1 + \varepsilon)^n c(Q_i).$$

Summing over all $i$, we find

$$c[g(Q)] \le (1 + \varepsilon)^n \sum_{i=1}^{m} |\det g'(a_i)|\, c(Q_i).$$

Since $\det g'(a_i) = J_g(a_i)$, the sum on the right is a Riemann sum $S(P, |J_g|)$, and
since $S(P, |J_g|) < \int_Q |J_g(t)|\, dt + \varepsilon$, we find

$$c[g(Q)] \le (1 + \varepsilon)^n \Bigl\{ \int_Q |J_g(t)|\, dt + \varepsilon \Bigr\}.$$

But $\varepsilon$ is arbitrary, so this implies $c[g(Q)] \le \int_Q |J_g(t)|\, dt$.
Lemma 7. Let $K$ be a compact cube in $g(T)$. Then

$$\mu(K) \le \int_{g^{-1}(K)} |J_g(t)|\, dt. \tag{24}$$

Proof. The integral exists as a Riemann integral because the integrand is
continuous on the compact set $g^{-1}(K)$. Also, by Lemma 3, the integral over $g^{-1}(K)$
is equal to that over the interior of $g^{-1}(K)$. By Theorem 15.1 we can write

$$\operatorname{int} g^{-1}(K) = \bigcup_{i=1}^{\infty} A_i,$$

where $\{A_1, A_2, \ldots\}$ is a countable disjoint collection of cubes whose closure lies
in the interior of $g^{-1}(K)$. Thus, $\operatorname{int} g^{-1}(K) = \bigcup_{i=1}^{\infty} Q_i$ where each $Q_i$ is the
closure of $A_i$. Since the integral in (24) is also a Lebesgue integral, we can use
countable additivity along with Lemma 6 to write

$$\int_{g^{-1}(K)} |J_g(t)|\, dt = \sum_{i=1}^{\infty} \int_{Q_i} |J_g(t)|\, dt \ge \sum_{i=1}^{\infty} \mu[g(Q_i)] = \mu\Bigl( \bigcup_{i=1}^{\infty} g(Q_i) \Bigr) = \mu(K).$$

Lemma 8. Let $K$ be a compact cube in $g(T)$. Then for any nonnegative upper
function $f$ which is bounded on $K$, the integral $\int_{g^{-1}(K)} f[g(t)]\, |J_g(t)|\, dt$ exists, and
we have the inequality

$$\int_K f(x)\, dx \le \int_{g^{-1}(K)} f[g(t)]\, |J_g(t)|\, dt. \tag{25}$$

Proof. Let $s$ be any nonnegative step function on $K$. Then there is a partition of
$K$ into a finite number of cubes $K_1, \ldots, K_r$ such that $s$ is constant on the interior
of each $K_i$, say $s(x) = a_i \ge 0$ if $x \in \operatorname{int} K_i$. Apply (24) to each cube $K_i$, multiply
by $a_i$ and add, to obtain

$$\int_K s(x)\, dx \le \int_{g^{-1}(K)} s[g(t)]\, |J_g(t)|\, dt. \tag{26}$$

Now let $\{s_k\}$ be an increasing sequence of nonnegative step functions which
converges almost everywhere on $K$ to the upper function $f$. Then (26) holds for
each $s_k$, and we let $k \to \infty$ to obtain (25). The existence of the integral on the
right follows from the Lebesgue bounded convergence theorem since both
$f[g(t)]$ and $|J_g(t)|$ are bounded on the compact set $g^{-1}(K)$.


Theorem 15.16. Let $K$ be a compact cube in $g(T)$. Then we have

$$\mu(K) = \int_{g^{-1}(K)} |J_g(t)|\, dt. \tag{27}$$

Proof. In view of Lemma 7, it suffices to prove the inequality

$$\int_{g^{-1}(K)} |J_g(t)|\, dt \le \mu(K). \tag{28}$$

As in the proof of Lemma 7, we write

$$\operatorname{int} g^{-1}(K) = \bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} Q_i,$$

where $\{A_1, A_2, \ldots\}$ is a countable disjoint collection of cubes and $Q_i$ is the closure
of $A_i$. Then

$$\int_{g^{-1}(K)} |J_g(t)|\, dt = \sum_{i=1}^{\infty} \int_{Q_i} |J_g(t)|\, dt. \tag{29}$$

Now we apply Lemma 8 to each integral $\int_{Q_i} |J_g(t)|\, dt$, taking $f = |J_g|$ and using
the coordinate transformation $h = g^{-1}$. This gives us the inequality

$$\int_{Q_i} |J_g(t)|\, dt \le \int_{g(Q_i)} |J_g[h(u)]|\, |J_h(u)|\, du = \int_{g(Q_i)} du = \mu[g(Q_i)],$$

which, when used in (29), gives (28).

15.13 COMPLETION OF THE PROOF OF THE TRANSFORMATION FORMULA

Now it is relatively easy to complete the proof of the formula

$$\int_{g(T)} f(x)\, dx = \int_T f[g(t)]\, |J_g(t)|\, dt, \tag{30}$$

under the conditions stated in Theorem 15.11. That is, we assume that T is an 
open subset of R", that g is a coordinate transformation on T, and that the integral 
on the left of (30) exists. We are to prove that the integral on the right also exists 
and that the two are equal. This will be deduced from the special case in which the 
integral on the left is extended over a cube K. 

Theorem 15.17. Let $K$ be a compact cube in $g(T)$ and assume the Lebesgue integral
$\int_K f(x)\, dx$ exists. Then the Lebesgue integral $\int_{g^{-1}(K)} f[g(t)]\, |J_g(t)|\, dt$ also exists,
and the two are equal.

Proof. It suffices to prove the theorem when $f$ is an upper function on $K$. Then
there is an increasing sequence of step functions $\{s_k\}$ such that $s_k \to f$ almost
everywhere on $K$. By Theorem 15.16 we have

$$\int_K s_k(x)\, dx = \int_{g^{-1}(K)} s_k[g(t)]\, |J_g(t)|\, dt$$

for each step function $s_k$. When $k \to \infty$, we have $\int_K s_k(x)\, dx \to \int_K f(x)\, dx$. Now
let

$$f_k(t) = \begin{cases} s_k[g(t)]\, |J_g(t)| & \text{if } t \in g^{-1}(K), \\ 0 & \text{if } t \in \mathbf{R}^n - g^{-1}(K). \end{cases}$$

Then

$$\int_{\mathbf{R}^n} f_k(t)\, dt = \int_{g^{-1}(K)} s_k[g(t)]\, |J_g(t)|\, dt = \int_K s_k(x)\, dx,$$

so

$$\lim_{k \to \infty} \int_{\mathbf{R}^n} f_k(t)\, dt = \lim_{k \to \infty} \int_K s_k(x)\, dx = \int_K f(x)\, dx.$$

By the Levi theorem (the analog of Theorem 10.24), the sequence $\{f_k\}$ converges
almost everywhere on $\mathbf{R}^n$ to a function in $L(\mathbf{R}^n)$. Since we have

$$\lim_{k \to \infty} f_k(t) = \begin{cases} f[g(t)]\, |J_g(t)| & \text{if } t \in g^{-1}(K), \\ 0 & \text{if } t \in \mathbf{R}^n - g^{-1}(K), \end{cases}$$

almost everywhere on $\mathbf{R}^n$, it follows that the integral $\int_{g^{-1}(K)} f[g(t)]\, |J_g(t)|\, dt$ exists
and equals $\int_K f(x)\, dx$. This completes the proof of Theorem 15.17.

Proof of Theorem 15.11. Now assume that the integral $\int_{g(T)} f(x)\, dx$ exists. Since
$g(T)$ is open, we can write

$$g(T) = \bigcup_{i=1}^{\infty} A_i,$$

where $\{A_1, A_2, \ldots\}$ is a countable disjoint collection of cubes whose closure lies
in $g(T)$. Let $K_i$ denote the closure of $A_i$. Using countable additivity and Theorem
15.17 we have

$$\int_{g(T)} f(x)\, dx = \sum_{i=1}^{\infty} \int_{K_i} f(x)\, dx = \sum_{i=1}^{\infty} \int_{g^{-1}(K_i)} f[g(t)]\, |J_g(t)|\, dt = \int_T f[g(t)]\, |J_g(t)|\, dt.$$
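
As a numerical illustration of formula (30), the sketch below compares both sides for the polar-coordinate map $g(r, \theta) = (r\cos\theta, r\sin\theta)$, whose Jacobian determinant is $r$. The integrand $e^{-(x^2+y^2)}$, the unit disk, the midpoint rule, and the grid size are all assumptions made for this example; the map is one-to-one only on the open rectangle $(0, 1) \times (0, 2\pi)$, whose image omits a set of measure zero.

```python
import math


def f(x, y):
    """Sample integrand (an illustrative choice)."""
    return math.exp(-(x * x + y * y))


def cartesian_integral(n):
    """Midpoint rule over [-1, 1]^2, keeping cells whose centre lies in the unit disk."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:
                total += f(x, y) * h * h
    return total


def polar_integral(n):
    """Midpoint rule for f(r cos θ, r sin θ) * r over (0, 1) x (0, 2π)."""
    dr = 1.0 / n
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += f(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total


# Closed form of the polar side: 2π * ∫_0^1 r e^{-r^2} dr = π (1 - 1/e).
exact = math.pi * (1.0 - math.exp(-1.0))
lhs = cartesian_integral(300)
rhs = polar_integral(300)
```

The two approximations agree with each other and with the closed form up to discretization error; the Cartesian sum is the less accurate of the two because of the cells cut by the boundary circle.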


EXERCISES 


15.1 If $f \in L(T)$, where $T$ is the triangular region in $\mathbf{R}^2$ with vertices at (0, 0), (1, 0),
and (0, 1), prove that

$$\int_T f(x, y)\, d(x, y) = \int_0^1 \Bigl[ \int_0^{1-x} f(x, y)\, dy \Bigr] dx = \int_0^1 \Bigl[ \int_0^{1-y} f(x, y)\, dx \Bigr] dy.$$

15.2 For fixed $c$, $0 < c < 1$, define $f$ on $\mathbf{R}^2$ as follows:

$$f(x, y) = \begin{cases} (1 - y)^c / (x - y)^c & \text{if } 0 \le y < x,\ 0 < x < 1, \\ 0 & \text{otherwise.} \end{cases}$$

Prove that $f \in L(\mathbf{R}^2)$ and calculate the double integral $\int_{\mathbf{R}^2} f(x, y)\, d(x, y)$.

15.3 Let $S$ be a measurable subset of $\mathbf{R}^2$ with finite measure $\mu(S)$. Using the notation of
Definition 15.4, prove that

$$\mu(S) = \int_{-\infty}^{\infty} \mu(S^x)\, dx = \int_{-\infty}^{\infty} \mu(S_y)\, dy.$$

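15.4 Let $f(x, y) = e^{-xy} \sin x \sin y$ if $x \ge 0$, $y \ge 0$, and let $f(x, y) = 0$ otherwise.
Prove that both iterated integrals

$$\int_0^{\infty} \Bigl[ \int_0^{\infty} f(x, y)\, dx \Bigr] dy \quad \text{and} \quad \int_0^{\infty} \Bigl[ \int_0^{\infty} f(x, y)\, dy \Bigr] dx$$

exist and are equal, but that the double integral of $f$ over $\mathbf{R}^2$ does not exist. Also, explain
why this does not contradict the Tonelli-Hobson test (Theorem 15.8).

15.5 Let $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ for $0 \le x \le 1$, $0 < y \le 1$, and let $f(0, 0) = 0$.
Prove that both iterated integrals

$$\int_0^1 \Bigl[ \int_0^1 f(x, y)\, dy \Bigr] dx \quad \text{and} \quad \int_0^1 \Bigl[ \int_0^1 f(x, y)\, dx \Bigr] dy$$

exist but are not equal. This shows that $f$ is not Lebesgue-integrable on $[0, 1] \times [0, 1]$.

15.6 Let $I = [0, 1] \times [0, 1]$, let $f(x, y) = (x - y)/(x + y)^3$ if $(x, y) \in I$, $(x, y) \ne (0, 0)$,
and let $f(0, 0) = 0$. Prove that $f \notin L(I)$ by considering the iterated integrals

$$\int_0^1 \Bigl[ \int_0^1 f(x, y)\, dy \Bigr] dx \quad \text{and} \quad \int_0^1 \Bigl[ \int_0^1 f(x, y)\, dx \Bigr] dy.$$

15.7 Let $I = [0, 1] \times [1, +\infty)$ and let $f(x, y) = e^{-xy} - 2e^{-2xy}$ if $(x, y) \in I$. Prove
that $f \notin L(I)$ by considering the iterated integrals

$$\int_0^1 \Bigl[ \int_1^{\infty} f(x, y)\, dy \Bigr] dx \quad \text{and} \quad \int_1^{\infty} \Bigl[ \int_0^1 f(x, y)\, dx \Bigr] dy.$$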

15.8 The following formulas for transforming double and triple integrals occur in
elementary calculus. Obtain them as consequences of Theorem 15.11 and give restrictions
on $T$ and $T'$ for validity of these formulas.

a) $\displaystyle \iint_T f(x, y)\, dx\, dy = \iint_{T'} f(r\cos\theta, r\sin\theta)\, r\, dr\, d\theta.$

b) $\displaystyle \iiint_T f(x, y, z)\, dx\, dy\, dz = \iiint_{T'} f(r\cos\theta, r\sin\theta, z)\, r\, dr\, d\theta\, dz.$

c) $\displaystyle \iiint_T f(x, y, z)\, dx\, dy\, dz = \iiint_{T'} f(\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi)\, \rho^2 \sin\varphi\, d\rho\, d\theta\, d\varphi.$

15.9 a) Prove that $\int_{\mathbf{R}^2} e^{-(x^2+y^2)}\, d(x, y) = \pi$ by transforming the integral to polar
coordinates.

b) Use part (a) to prove that $\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$.

c) Use part (b) to prove that $\int_{\mathbf{R}^n} e^{-\|x\|^2}\, d(x_1, \ldots, x_n) = \pi^{n/2}$.

d) Use part (b) to calculate $\int_{-\infty}^{\infty} e^{-tx^2}\, dx$ and $\int_{-\infty}^{\infty} x^2 e^{-tx^2}\, dx$, $t > 0$.

15.10 Let $V_n(a)$ denote the $n$-measure of the $n$-ball $B(0; a)$ of radius $a$. This exercise
outlines a proof of the formula

$$V_n(a) = \frac{\pi^{n/2} a^n}{\Gamma(\tfrac{1}{2}n + 1)}.$$

a) Use a linear change of variable to prove that $V_n(a) = a^n V_n(1)$.

b) Assume $n \ge 3$, express the integral for $V_n(1)$ as the iteration of an $(n - 2)$-fold
integral and a double integral, and use part (a) for an $(n - 2)$-ball to obtain the
formula

$$V_n(1) = V_{n-2}(1) \int_0^{2\pi} \Bigl[ \int_0^1 (1 - r^2)^{n/2 - 1}\, r\, dr \Bigr] d\theta = V_{n-2}(1)\, \frac{2\pi}{n}.$$

c) From the recursion formula in (b) deduce that

$$V_n(1) = \frac{\pi^{n/2}}{\Gamma(\tfrac{1}{2}n + 1)}.$$
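
The recursion in part (b) and the closed form in part (c) can be compared numerically. The sketch below is only a sanity check, not a proof; it assumes the familiar starting values $V_1(1) = 2$ and $V_2(1) = \pi$, and the function names are illustrative.

```python
import math


def ball_volume_closed_form(n):
    """V_n(1) from the closed form pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def ball_volume_recursive(n):
    """V_n(1) from the recursion V_n(1) = V_{n-2}(1) * 2*pi/n.

    The base cases V_1(1) = 2 and V_2(1) = pi are assumed known.
    """
    if n == 1:
        return 2.0
    if n == 2:
        return math.pi
    return ball_volume_recursive(n - 2) * 2.0 * math.pi / n


# Pairs (closed form, recursion) for n = 1, ..., 10.
comparison = [(ball_volume_closed_form(n), ball_volume_recursive(n))
              for n in range(1, 11)]
```

For $n = 3$ both expressions give $4\pi/3$, the volume of the unit ball in ordinary space.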


15.11 Refer to Exercise 15.10 and prove that

$$\int_{B(0;1)} x_k^2\, d(x_1, \ldots, x_n) = \frac{V_n(1)}{n + 2}$$

for each $k = 1, 2, \ldots, n$.

15.12 Refer to Exercise 15.10 and express the integral for $V_n(1)$ as the iteration of an
$(n - 1)$-fold integral and a one-dimensional integral, to obtain the recursion formula

$$V_n(1) = 2V_{n-1}(1) \int_0^1 (1 - x^2)^{(n-1)/2}\, dx.$$

Put $x = \cos t$ in the integral, and use the formula of Exercise 15.10 to deduce that

$$\int_0^{\pi/2} \cos^n t\, dt = \frac{\sqrt{\pi}}{2}\, \frac{\Gamma(\tfrac{1}{2}n + \tfrac{1}{2})}{\Gamma(\tfrac{1}{2}n + 1)}.$$

15.13 If $a > 0$, let $S_n(a) = \{(x_1, \ldots, x_n) : |x_1| + \cdots + |x_n| \le a\}$, and let $V_n(a)$ denote
the $n$-measure of $S_n(a)$. This exercise outlines a proof of the formula $V_n(a) = 2^n a^n / n!$.

a) Use a linear change of variable to prove that $V_n(a) = a^n V_n(1)$.

b) Assume $n \ge 2$, express the integral for $V_n(1)$ as an iteration of a one-dimensional
integral and an $(n - 1)$-fold integral, use (a) to show that

$$V_n(1) = V_{n-1}(1) \int_{-1}^{1} (1 - |x|)^{n-1}\, dx = 2V_{n-1}(1)/n,$$

and deduce that $V_n(1) = 2^n / n!$.

15.14 If $a > 0$ and $n \ge 2$, let $S_n(a)$ denote the following set in $\mathbf{R}^n$:

$$S_n(a) = \{(x_1, \ldots, x_n) : |x_i| + |x_n| \le a \text{ for each } i = 1, \ldots, n - 1\}.$$

Let $V_n(a)$ denote the $n$-measure of $S_n(a)$. Use a method suggested by Exercise 15.13 to
prove that $V_n(a) = 2^n a^n / n$.

15.15 Let $Q_n(a)$ denote the "first quadrant" of the $n$-ball $B(0; a)$ given by

$$Q_n(a) = \{(x_1, \ldots, x_n) : \|x\| \le a \text{ and } 0 \le x_i \le a \text{ for each } i = 1, 2, \ldots, n\}.$$

Let $f(x) = x_1 \cdots x_n$ and prove that

$$\int_{Q_n(a)} f(x)\, dx = \frac{a^{2n}}{2^n n!}.$$

CHAPTER 16


CAUCHY’S THEOREM AND THE RESIDUE CALCULUS


16.1 ANALYTIC FUNCTIONS 

The concept of derivative for functions of a complex variable was introduced in
Chapter 5 (Section 5.15). The most important functions in complex variable theory
are those which possess a continuous derivative at each point of an open set.
These are called analytic functions.

Definition 16.1. Let $f = u + iv$ be a complex-valued function defined on an open
set $S$ in the complex plane $\mathbf{C}$. Then $f$ is said to be analytic on $S$ if the derivative $f'$
exists and is continuous* at every point of $S$.

NOTE. If $T$ is an arbitrary subset of $\mathbf{C}$ (not necessarily open), the terminology
"$f$ is analytic on $T$" is used to mean that $f$ is analytic on some open set containing
$T$. In particular, $f$ is analytic at a point $z$ if there is an open disk about $z$ on which
$f$ is analytic.

It is possible for a function to have a derivative at a point without being
analytic at the point. For example, if $f(z) = |z|^2$, then $f$ has a derivative at 0 but
at no other point of $\mathbf{C}$.
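
The example $f(z) = |z|^2$ can be explored with difference quotients. The sketch below is a numerical illustration only; the step size and the sample point $1 + i$ are arbitrary choices, and the function names are hypothetical.

```python
def f(z):
    """The function |z|^2 from the example."""
    return abs(z) ** 2


def difference_quotient(z, h):
    """(f(z + h) - f(z)) / h for a complex increment h."""
    return (f(z + h) - f(z)) / h


def quotients_along_axes(z, step=1e-6):
    """Difference quotients of f at z for a real and for a purely imaginary increment."""
    return difference_quotient(z, step), difference_quotient(z, 1j * step)


# At 0 both quotients are small, consistent with f'(0) = 0.
real_dir_0, imag_dir_0 = quotients_along_axes(0)

# At 1 + i the two quotients approach different values (about 2 and about -2i),
# so no complex derivative can exist there.
real_dir_1, imag_dir_1 = quotients_along_axes(1 + 1j)
```

A complex derivative requires the same limit for every direction of approach, so the disagreement at $1 + i$ already rules out differentiability at that point.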

Examples of analytic functions were encountered in Chapter 5. If $f(z) = z^n$
(where $n$ is a positive integer), then $f$ is analytic everywhere in $\mathbf{C}$ and its derivative
is $f'(z) = nz^{n-1}$. When $n$ is a negative integer, the equation $f(z) = z^n$ if $z \ne 0$
defines a function analytic everywhere except at 0. Polynomials are analytic
everywhere in $\mathbf{C}$, and rational functions are analytic everywhere except at points
where the denominator vanishes. The exponential function, defined by the formula
$e^z = e^x(\cos y + i \sin y)$, where $z = x + iy$, is analytic everywhere in $\mathbf{C}$ and is
equal to its derivative. The complex sine and cosine functions (being linear
combinations of exponentials) are also analytic everywhere in $\mathbf{C}$.

Let $f(z) = \operatorname{Log} z$ if $z \ne 0$, where $\operatorname{Log} z$ denotes the principal logarithm of
$z$ (see Definition 1.53). Then $f$ is analytic everywhere in $\mathbf{C}$ except at those points
$z = x + iy$ for which $x \le 0$ and $y = 0$. At these points, the principal logarithm
fails to be continuous. Analyticity at the other points is easily shown by verifying
that the real and imaginary parts of $f$ satisfy the Cauchy-Riemann equations
(Theorem 12.6).

* It can be shown that the existence of $f'$ on $S$ automatically implies continuity of $f'$ on
$S$ (a fact discovered by Goursat in 1900). Hence an analytic function can be defined as
one which merely possesses a derivative everywhere on $S$. However, we shall include
continuity of $f'$ as part of the definition of analyticity, since this allows some of the proofs
to run more smoothly.

We shall see later that analyticity at a point z puts severe restrictions on a 
function. It implies the existence of all higher derivatives in a neighborhood of z 
and also guarantees the existence of a convergent power series which represents 
the function in a neighborhood of z. This is in marked contrast to the behavior of 
real-valued functions, where it is possible to have existence and continuity of the 
first derivative without existence of the second derivative. 


16.2 PATHS AND CURVES IN THE COMPLEX PLANE 

Many fundamental properties of analytic functions are most easily deduced with 
the help of integrals taken along curves in the complex plane. These are called 
contour integrals (or complex line integrals) and they are discussed in the next 
section. This section lists some terminology used for different types of curves, 
such as those in Fig. 16.1. 


Figure 16.1 (panels: arc, Jordan arc, closed curve)


We recall that a path in the complex plane is a complex-valued function $\gamma$,
continuous on a compact interval $[a, b]$. The image of $[a, b]$ under $\gamma$ (the graph
of $\gamma$) is said to be a curve described by $\gamma$ and it is said to join the points $\gamma(a)$
and $\gamma(b)$.

If $\gamma(a) \ne \gamma(b)$, the curve is called an arc with endpoints $\gamma(a)$ and $\gamma(b)$.

If $\gamma$ is one-to-one on $[a, b]$, the curve is called a simple arc or a Jordan arc.

If $\gamma(a) = \gamma(b)$, the curve is called a closed curve. If $\gamma(a) = \gamma(b)$ and if $\gamma$ is
one-to-one on the half-open interval $[a, b)$, the curve is called a simple closed curve,
or a Jordan curve.

The path $\gamma$ is called rectifiable if it has finite arc length, as defined in Section
6.10. We recall that $\gamma$ is rectifiable if, and only if, $\gamma$ is of bounded variation on
$[a, b]$. (See Section 7.27 and Theorem 6.17.)

A path $\gamma$ is called piecewise smooth if it has a bounded derivative $\gamma'$ which is
continuous everywhere on $[a, b]$ except (possibly) at a finite number of points.
At these exceptional points it is required that both right- and left-hand derivatives
exist. Every piecewise smooth path is rectifiable and its arc length is given by the
integral $\int_a^b |\gamma'(t)|\, dt$.

A piecewise smooth closed path will be called a circuit.

Definition 16.2. If $a \in \mathbf{C}$ and $r > 0$, the path $\gamma$ defined by the equation

$$\gamma(\theta) = a + re^{i\theta}, \quad 0 \le \theta \le 2\pi,$$

is called a positively oriented circle with center at $a$ and radius $r$.

NOTE. The geometric meaning of $\gamma(\theta)$ is shown in Fig. 16.2. As $\theta$ varies from 0
to $2\pi$, the point $\gamma(\theta)$ moves counterclockwise around the circle.

Figure 16.2


16.3 CONTOUR INTEGRALS 

Contour integrals will be defined in terms of complex Riemann-Stieltjes integrals, 
discussed in Section 7.27. 


Definition 16.3. Let $\gamma$ be a path in the complex plane with domain $[a, b]$, and let $f$
be a complex-valued function defined on the graph of $\gamma$. The contour integral of $f$
along $\gamma$, denoted by $\int_\gamma f$, is defined by the equation

$$\int_\gamma f = \int_a^b f[\gamma(t)]\, d\gamma(t),$$

whenever the Riemann-Stieltjes integral on the right exists.

NOTATION. We also write

$$\int_\gamma f(z)\, dz \quad \text{or} \quad \int_{\gamma(a)}^{\gamma(b)} f(z)\, dz$$

for the integral. The dummy symbol $z$ can be replaced by any other convenient
symbol. For example, $\int_\gamma f(z)\, dz = \int_\gamma f(w)\, dw$.

If $\gamma$ is rectifiable, then a sufficient condition for the existence of $\int_\gamma f$ is that $f$ be
continuous on the graph of $\gamma$ (Theorem 7.27).

The effect of replacing y by an equivalent path (as defined in Section 6.12) is, 
at worst, a change in sign. In fact, we have : 

Theorem 16.4. Let $\gamma$ and $\delta$ be equivalent paths describing the same curve $\Gamma$. If
$\int_\gamma f$ exists, then $\int_\delta f$ also exists. Moreover, we have

$$\int_\gamma f = \int_\delta f$$

if $\gamma$ and $\delta$ trace out $\Gamma$ in the same direction, whereas

$$\int_\gamma f = -\int_\delta f$$

if $\gamma$ and $\delta$ trace out $\Gamma$ in opposite directions.

Proof. Suppose $\delta(t) = \gamma[u(t)]$ where $u$ is strictly monotonic on $[c, d]$. From
the change-of-variable formula for Riemann-Stieltjes integrals (Theorem 7.7) we
have

$$\int_{u(c)}^{u(d)} f[\gamma(t)]\, d\gamma(t) = \int_c^d f[\delta(t)]\, d\delta(t) = \int_\delta f. \tag{1}$$

If $u$ is increasing then $u(c) = a$, $u(d) = b$ and (1) becomes $\int_\gamma f = \int_\delta f$.

If $u$ is decreasing then $u(c) = b$, $u(d) = a$ and (1) becomes $-\int_\gamma f = \int_\delta f$.

The reader can easily verify the following additive properties of contour 
integrals. 


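Theorem 16.5. Let $\gamma$ be a path with domain $[a, b]$.

i) If the integrals $\int_\gamma f$ and $\int_\gamma g$ exist, then the integral $\int_\gamma (\alpha f + \beta g)$ exists for every
pair of complex numbers $\alpha$, $\beta$, and we have

$$\int_\gamma (\alpha f + \beta g) = \alpha \int_\gamma f + \beta \int_\gamma g.$$

ii) Let $\gamma_1$ and $\gamma_2$ denote the restrictions of $\gamma$ to $[a, c]$ and $[c, b]$, respectively,
where $a < c < b$. If two of the three integrals in (2) exist, then the third also exists
and we have

$$\int_\gamma f = \int_{\gamma_1} f + \int_{\gamma_2} f. \tag{2}$$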

In practice, most paths of integration are rectifiable. For such paths the 
following theorem is often used to estimate the absolute value of a contour integral. 


Theorem 16.6. Let $\gamma$ be a rectifiable path of length $\Lambda(\gamma)$. If the integral $\int_\gamma f$ exists,
and if $|f(z)| \le M$ for all $z$ on the graph of $\gamma$, then we have the inequality

$$\Bigl| \int_\gamma f \Bigr| \le M \Lambda(\gamma).$$

Proof. We simply observe that all Riemann-Stieltjes sums which occur in the
definition of $\int_a^b f[\gamma(t)]\, d\gamma(t)$ have absolute value not exceeding $M\Lambda(\gamma)$.

Contour integrals taken over piecewise smooth curves can be expressed as
Riemann integrals. The following theorem is an easy consequence of Theorem 7.8.

Theorem 16.7. Let $\gamma$ be a piecewise smooth path with domain $[a, b]$. If the contour
integral $\int_\gamma f$ exists, we have

$$\int_\gamma f = \int_a^b f[\gamma(t)]\, \gamma'(t)\, dt.$$
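
The Riemann-integral form in Theorem 16.7 lends itself to direct numerical evaluation. The following is a minimal sketch, assuming a smooth path supplied together with its derivative and using the midpoint rule; the function names, the sample circle, and the sample integrands are illustrative choices. It also compares one result with the bound of Theorem 16.6.

```python
import cmath
import math


def contour_integral(f, gamma, dgamma, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f[gamma(t)] * gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * h
        total += f(gamma(t)) * dgamma(t) * h
    return total


def circle(center, radius):
    """Positively oriented circle on [0, 2*pi], returned as (path, derivative)."""
    def gamma(t):
        return center + radius * cmath.exp(1j * t)

    def dgamma(t):
        return 1j * radius * cmath.exp(1j * t)

    return gamma, dgamma


gamma, dgamma = circle(0, 2.0)
integral_of_z_squared = contour_integral(lambda z: z * z, gamma, dgamma, 0.0, 2 * math.pi)
integral_of_reciprocal = contour_integral(lambda z: 1 / z, gamma, dgamma, 0.0, 2 * math.pi)

# Bound M * Lambda(gamma) for f(z) = 1/z: M = 1/2 on the circle, length = 4*pi.
bound = 0.5 * (2 * math.pi * 2.0)
```

For this circle the integral of $z^2$ is numerically zero, while the integral of $1/z$ is close to $2\pi i$, whose absolute value equals the bound $M\Lambda(\gamma) = 2\pi$.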

16.4 THE INTEGRAL ALONG A CIRCULAR PATH AS A FUNCTION OF 
THE RADIUS 

Consider a circular path $\gamma$ of radius $r \ge 0$ and center $a$, given by

$$\gamma(\theta) = a + re^{i\theta}, \quad 0 \le \theta \le 2\pi.$$

In this section we study the integral $\int_\gamma f$ as a function of the radius $r$.

Let $\varphi(r) = \int_\gamma f$. Since $\gamma'(\theta) = ire^{i\theta}$, Theorem 16.7 gives us

$$\varphi(r) = \int_0^{2\pi} f(a + re^{i\theta})\, ire^{i\theta}\, d\theta. \tag{3}$$

As $r$ varies over an interval $[r_1, r_2]$, where $0 \le r_1 < r_2$, the points $\gamma(\theta)$ trace out
an annulus which we denote by $A(a; r_1, r_2)$. (See Fig. 16.3.) Thus,

$$A(a; r_1, r_2) = \{z : r_1 \le |z - a| \le r_2\}.$$

If $r_1 = 0$ the annulus is a closed disk of radius $r_2$. If $f$ is continuous on the annulus,
then $\varphi$ is continuous on the interval $[r_1, r_2]$. If $f$ is analytic on the annulus, then $\varphi$
is differentiable on $[r_1, r_2]$. The next theorem shows that $\varphi$ is constant on $[r_1, r_2]$
if $f$ is analytic everywhere on the annulus except possibly on a finite subset,
provided that $f$ is continuous on this subset.




Theorem 16.8. Assume $f$ is analytic on the annulus $A(a; r_1, r_2)$, except possibly at a
finite number of points. At these exceptional points assume that $f$ is continuous.
Then the function $\varphi$ defined by (3) is constant on the interval $[r_1, r_2]$. Moreover,
if $r_1 = 0$ the constant is 0.

Proof. Let $z_1, \ldots, z_n$ denote the exceptional points where $f$ fails to be analytic.
Label these points according to increasing distances from the center, say

$$|z_1 - a| \le |z_2 - a| \le \cdots \le |z_n - a|,$$

and let $R_k = |z_k - a|$. Also, let $R_0 = r_1$, $R_{n+1} = r_2$.

The union of the intervals $[R_k, R_{k+1}]$ for $k = 0, 1, 2, \ldots, n$ is the interval
$[r_1, r_2]$. We will show that $\varphi$ is constant on each interval $[R_k, R_{k+1}]$. We write
(3) in the form

$$\varphi(r) = \int_0^{2\pi} g(r, \theta)\, d\theta, \quad \text{where } g(r, \theta) = f(a + re^{i\theta})\, ire^{i\theta}.$$

An easy application of the chain rule shows that we have

$$\frac{\partial g}{\partial \theta} = ir\, \frac{\partial g}{\partial r}. \tag{4}$$

(The reader should verify this formula.) Continuity of $f'$ implies continuity of the
partial derivatives $\partial g/\partial r$ and $\partial g/\partial \theta$. Therefore, on each open interval $(R_k, R_{k+1})$,
we can calculate $\varphi'(r)$ by differentiation under the integral sign (Theorem 7.40)
and then use (4) and the second fundamental theorem of calculus (Theorem 7.34)
to obtain

$$\varphi'(r) = \int_0^{2\pi} \frac{\partial g}{\partial r}\, d\theta = \frac{1}{ir} \int_0^{2\pi} \frac{\partial g}{\partial \theta}\, d\theta = \frac{1}{ir} \{g(r, 2\pi) - g(r, 0)\} = 0.$$

Applying Theorem 12.10, we see that $\varphi$ is constant on each open subinterval
$(R_k, R_{k+1})$. By continuity, $\varphi$ is constant on each closed subinterval $[R_k, R_{k+1}]$ and
hence on their union $[r_1, r_2]$. From (3) we see that $\varphi(r) \to 0$ as $r \to 0$ so the
constant value of $\varphi$ is 0 if $r_1 = 0$.
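
The chain-rule identity (4) used in the proof can be verified by a direct computation. The following worked derivation is a sketch, assuming only that $f$ is differentiable at the point $a + re^{i\theta}$ and writing $g(r, \theta) = f(a + re^{i\theta})\, ire^{i\theta}$ as in the proof:

```latex
\begin{aligned}
\frac{\partial g}{\partial r}
  &= f'(a + re^{i\theta})\, e^{i\theta} \cdot ire^{i\theta}
     + f(a + re^{i\theta})\, ie^{i\theta}, \\
\frac{\partial g}{\partial \theta}
  &= f'(a + re^{i\theta})\, ire^{i\theta} \cdot ire^{i\theta}
     + f(a + re^{i\theta})\, i^{2} re^{i\theta} \\
  &= ir \left\{ f'(a + re^{i\theta})\, e^{i\theta} \cdot ire^{i\theta}
     + f(a + re^{i\theta})\, ie^{i\theta} \right\}
   = ir\, \frac{\partial g}{\partial r}.
\end{aligned}
```

Each partial derivative is a product-rule expansion; factoring $ir$ out of the second line reproduces the first.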


16.5 CAUCHY’S INTEGRAL THEOREM FOR A CIRCLE 

The following special case of Theorem 16.8 is of particular importance. 

Theorem 16.9 (Cauchy's integral theorem for a circle). If $f$ is analytic on a disk
$B(a; R)$ except possibly for a finite number of points at which it is continuous, then

$$\int_\gamma f = 0$$

for every circular path $\gamma$ with center at $a$ and radius $r < R$.

Proof. Choose $r_2$ so that $r < r_2 < R$ and apply Theorem 16.8 with $r_1 = 0$.

NOTE. There is a more general form of Cauchy's integral theorem in which the
circular path $\gamma$ is replaced by a more general closed path. These more general paths
will be introduced through the concept of homotopy.


16.6 HOMOTOPIC CURVES 

Figure 16.4 shows three arcs having the same endpoints $A$ and $B$ and lying in an
open region $D$. Arc 1 can be continuously deformed into arc 2 through a collection
of intermediate arcs, each of which lies in $D$. Two arcs with this property are said
to be homotopic in $D$. Arc 1 cannot be so deformed into arc 3 (because of the
hole separating them) so they are not homotopic in $D$.

Figure 16.4

In this section we give a formal definition of homotopy. Then we show that, if
$f$ is analytic in $D$, the contour integral of $f$ from $A$ to $B$ has the same value along
any two homotopic paths in $D$. In other words, the value of a contour integral
$\int f$ is unaltered under a continuous deformation of the path, provided the
intermediate contours remain within the region of analyticity of $f$. This property
of contour integrals is of utmost importance in the applications of complex
integration.

Definition 16.10. Let $\gamma_0$ and $\gamma_1$ be two paths with a common domain $[a, b]$. Assume
that either

a) $\gamma_0$ and $\gamma_1$ have the same endpoints: $\gamma_0(a) = \gamma_1(a)$ and $\gamma_0(b) = \gamma_1(b)$, or

b) $\gamma_0$ and $\gamma_1$ are both closed paths: $\gamma_0(a) = \gamma_0(b)$ and $\gamma_1(a) = \gamma_1(b)$.

Let $D$ be a subset of $\mathbf{C}$ containing the graphs of $\gamma_0$ and $\gamma_1$. Then $\gamma_0$ and $\gamma_1$ are said
to be homotopic in $D$ if there exists a function $h$, continuous on the rectangle
$[0, 1] \times [a, b]$ and with values in $D$, such that

1) $h(0, t) = \gamma_0(t)$ if $t \in [a, b]$,

2) $h(1, t) = \gamma_1(t)$ if $t \in [a, b]$.

In addition we require that for each $s$ in $[0, 1]$ we have

3a) $h(s, a) = \gamma_0(a)$ and $h(s, b) = \gamma_0(b)$, in case (a);
or

3b) $h(s, a) = h(s, b)$, in case (b).

The function $h$ is called a homotopy.

The concept of homotopy has a simple geometric interpretation. For each
fixed $s$ in $[0, 1]$, let $\gamma_s(t) = h(s, t)$. Then $\gamma_s$ can be regarded as an intermediate
moving path which starts from $\gamma_0$ when $s = 0$ and ends at $\gamma_1$ when $s = 1$.

Example 1. Homotopy to a point. If $\gamma_1$ is a constant function, so that its graph is a single
point, and if $\gamma_0$ is homotopic to $\gamma_1$ in $D$, we say that $\gamma_0$ is homotopic to a point in $D$.

Example 2. Linear homotopy. If, for each $t$ in $[a, b]$, the line segment joining $\gamma_0(t)$ and
$\gamma_1(t)$ lies in $D$, then $\gamma_0$ and $\gamma_1$ are homotopic in $D$ because the function

$$h(s, t) = s\gamma_1(t) + (1 - s)\gamma_0(t)$$

serves as a homotopy. In this case we say that $\gamma_0$ and $\gamma_1$ are linearly homotopic in $D$. In
particular, any two paths with domain $[a, b]$ are linearly homotopic in $\mathbf{C}$ (the complex
plane) or, more generally, in any convex set containing their graphs.

NOTE. Homotopy is an equivalence relation.

The next theorem shows that between any two homotopic paths we can inter- 
polate a finite number of intermediate polygonal paths, each of which is linearly 
homotopic to its neighbor. 

Theorem 16.11 (Polygonal interpolation theorem). Let $\gamma_0$ and $\gamma_1$ be homotopic
paths in an open set $D$. Then there exist a finite number of paths $\alpha_0, \alpha_1, \ldots, \alpha_n$ such
that:

a) $\alpha_0 = \gamma_0$ and $\alpha_n = \gamma_1$,

b) $\alpha_j$ is a polygonal path for $1 \le j \le n - 1$,

c) $\alpha_j$ is linearly homotopic in $D$ to $\alpha_{j+1}$ for $0 \le j \le n - 1$.

Proof. Since $\gamma_0$ and $\gamma_1$ are homotopic in $D$, there is a homotopy $h$ satisfying the
conditions in Definition 16.10. Consider partitions

$$\{s_0, s_1, \ldots, s_n\} \text{ of } [0, 1] \quad \text{and} \quad \{t_0, t_1, \ldots, t_n\} \text{ of } [a, b],$$

into $n$ equal parts, choosing $n$ so large that the image of each rectangle
$[s_j, s_{j+1}] \times [t_k, t_{k+1}]$ under $h$ is contained in an open disk $D_{jk}$ contained in $D$. (The reader
should verify that this is possible because of uniform continuity of $h$.)

On the intermediate path $\gamma_{s_j}$ given by

$$\gamma_{s_j}(t) = h(s_j, t) \quad \text{for } 0 \le j \le n,$$

we inscribe a polygonal path $\alpha_j$ with vertices at the points $h(s_j, t_k)$. That is,

$$\alpha_j(t_k) = h(s_j, t_k) \quad \text{for } k = 0, 1, \ldots, n,$$

and $\alpha_j$ is linear on each subinterval $[t_k, t_{k+1}]$ for $0 \le k \le n - 1$. We also define
$\alpha_0 = \gamma_0$ and $\alpha_n = \gamma_1$. (An example is shown in Fig. 16.5.)

The four vertices $\alpha_j(t_k)$, $\alpha_j(t_{k+1})$, $\alpha_{j+1}(t_k)$, and $\alpha_{j+1}(t_{k+1})$ all lie in the disk $D_{jk}$.
Since $D_{jk}$ is convex, the line segments joining them also lie in $D_{jk}$ and hence the
points

$$s\alpha_{j+1}(t) + (1 - s)\alpha_j(t), \tag{5}$$

lie in $D_{jk}$ for each $(s, t)$ in $[0, 1] \times [t_k, t_{k+1}]$. Therefore the points (5) lie in $D$
for all $(s, t)$ in $[0, 1] \times [a, b]$, so $\alpha_{j+1}$ is linearly homotopic to $\alpha_j$ in $D$.


16.7 INVARIANCE OF CONTOUR INTEGRALS UNDER HOMOTOPY 


Theorem 16.12. Assume $f$ is analytic on an open set $D$, except possibly for a finite
number of points where it is continuous. If $\gamma_0$ and $\gamma_1$ are piecewise smooth paths
which are homotopic in $D$ we have

$$\int_{\gamma_0} f = \int_{\gamma_1} f.$$

Proof. First we consider the case in which $\gamma_0$ and $\gamma_1$ are linearly homotopic. For
each $s$ in $[0, 1]$ let

$$\gamma_s(t) = s\gamma_1(t) + (1 - s)\gamma_0(t) \quad \text{if } t \in [a, b].$$

Then $\gamma_s$ is piecewise smooth and its graph lies in $D$. Write

$$\gamma_s(t) = \gamma_0(t) + s\alpha(t), \quad \text{where } \alpha(t) = \gamma_1(t) - \gamma_0(t),$$

and define

$$\varphi(s) = \int_{\gamma_s} f = \int_a^b f[\gamma_s(t)]\, d\gamma_0(t) + s \int_a^b f[\gamma_s(t)]\, d\alpha(t),$$

for $0 \le s \le 1$. We wish to prove that $\varphi(0) = \varphi(1)$. We will in fact prove that $\varphi$
is constant on $[0, 1]$.

We use Theorem 7.40 to calculate $\varphi'(s)$ by differentiation under the integral
sign. Since

$$\frac{\partial}{\partial s} \gamma_s(t) = \alpha(t),$$

this gives us

$$\begin{aligned}
\varphi'(s) &= \int_a^b f'[\gamma_s(t)]\, \alpha(t)\, d\gamma_0(t) + s \int_a^b f'[\gamma_s(t)]\, \alpha(t)\, d\alpha(t) + \int_a^b f[\gamma_s(t)]\, d\alpha(t) \\
&= \int_a^b \alpha(t) f'[\gamma_s(t)]\, d\gamma_s(t) + \int_a^b f[\gamma_s(t)]\, d\alpha(t) \\
&= \int_a^b \alpha(t) f'[\gamma_s(t)]\, \gamma_s'(t)\, dt + \int_a^b f[\gamma_s(t)]\, d\alpha(t) \\
&= \int_a^b \alpha(t)\, d\{f[\gamma_s(t)]\} + \int_a^b f[\gamma_s(t)]\, d\alpha(t) \\
&= \alpha(b) f[\gamma_s(b)] - \alpha(a) f[\gamma_s(a)],
\end{aligned}$$

by the formula for integration by parts (Theorem 7.6). But, as the reader can easily
verify, the last expression vanishes because $\gamma_0$ and $\gamma_1$ are homotopic, so $\varphi'(s) = 0$
for all $s$ in $[0, 1]$. Therefore $\varphi$ is constant on $[0, 1]$. This proves the theorem when
$\gamma_0$ and $\gamma_1$ are linearly homotopic in $D$.

If they are homotopic in $D$ under a general homotopy $h$, we interpolate
polygonal paths $\alpha_j$ as described in Theorem 16.11. Since each polygonal path is
piecewise smooth, we can repeatedly apply the result just proved to obtain

$$\int_{\gamma_0} f = \int_{\alpha_0} f = \int_{\alpha_1} f = \cdots = \int_{\alpha_n} f = \int_{\gamma_1} f.$$
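
The invariance just proved can be observed numerically along a linear homotopy. The sketch below makes several assumptions for illustration: the entire function $e^z$, a straight segment and a parabolic arc joining 0 to $1 + i$, and a midpoint-rule quadrature. The comparison value $e^{1+i} - 1$ comes from the antiderivative of $e^z$, a fact not used in the proof above.

```python
import cmath


def contour_integral(f, gamma, dgamma, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f[gamma(t)] * gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * h
        total += f(gamma(t)) * dgamma(t) * h
    return total


def gamma0(t):
    """Straight segment from 0 to 1 + i on [0, 1]."""
    return t * (1 + 1j)


def dgamma0(t):
    return 1 + 1j


def gamma1(t):
    """Parabolic arc from 0 to 1 + i on [0, 1]."""
    return t + 1j * t * t


def dgamma1(t):
    return 1 + 2j * t


def linear_homotopy(s):
    """Intermediate path gamma_s = s*gamma1 + (1 - s)*gamma0 and its derivative."""
    def g(t):
        return s * gamma1(t) + (1 - s) * gamma0(t)

    def dg(t):
        return s * dgamma1(t) + (1 - s) * dgamma0(t)

    return g, dg


values = [contour_integral(cmath.exp, *linear_homotopy(s), 0.0, 1.0)
          for s in (0.0, 0.25, 0.5, 0.75, 1.0)]
expected = cmath.exp(1 + 1j) - 1
```

All five intermediate paths give the same value up to quadrature error, in line with the statement that $\varphi$ is constant on $[0, 1]$.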



16.8 GENERAL FORM OF CAUCHY’S INTEGRAL THEOREM 

The general form of Cauchy’s theorem referred to earlier can now be easily deduced 
from Theorems 16.9 and 16.12. We remind the reader that a circuit is a piecewise 
smooth closed path. 

Theorem 16.13 (Cauchy’s integral theorem for circuits homotopic to a point). Assume
$f$ is analytic on an open set $D$, except possibly for a finite number of points at which
we assume $f$ is continuous. Then for every circuit $\gamma$ which is homotopic to a point in
$D$ we have

$$\int_\gamma f = 0.$$

Proof. Since $\gamma$ is homotopic to a point in $D$, $\gamma$ is also homotopic to a circular
path $\delta$ in $D$ with arbitrarily small radius. Therefore $\int_\gamma f = \int_\delta f$, and $\int_\delta f = 0$ by
Theorem 16.9.

Definition 16.14. An open connected set $D$ is called simply connected if every closed
path in $D$ is homotopic to a point in $D$.

Geometrically, a simply connected region is one without holes. Cauchy’s
theorem shows that, in a simply connected region $D$, the integral of an analytic
function is zero around any circuit in $D$.



16.9 CAUCHY’S INTEGRAL FORMULA 

The next theorem reveals a remarkable property of analytic functions. It relates 
the value of an analytic function at a point with the values on a closed curve not 
containing the point. 

Theorem 16.15 (Cauchy’s integral formula). Assume $f$ is analytic on an open set $D$,
and let $\gamma$ be any circuit which is homotopic to a point in $D$. Then for any point $z$ in
$D$ which is not on the graph of $\gamma$ we have

$$\int_\gamma \frac{f(w)}{w - z}\, dw = f(z) \int_\gamma \frac{1}{w - z}\, dw. \tag{6}$$

Proof. Define a new function $g$ on $D$ as follows:

$$g(w) = \begin{cases} \dfrac{f(w) - f(z)}{w - z} & \text{if } w \neq z, \\[2mm] f'(z) & \text{if } w = z. \end{cases}$$

Then $g$ is analytic at each point $w \neq z$ in $D$ and, at the point $z$ itself, $g$ is continuous.
Applying Cauchy’s integral theorem to $g$ we have $\int_\gamma g = 0$ for every circuit $\gamma$
homotopic to a point in $D$. But if $z$ is not on the graph of $\gamma$ we can write

$$\int_\gamma g = \int_\gamma \frac{f(w) - f(z)}{w - z}\, dw = \int_\gamma \frac{f(w)}{w - z}\, dw - f(z) \int_\gamma \frac{1}{w - z}\, dw,$$

which proves (6).

NOTE. The same proof shows that (6) is also valid if there is a finite subset $T$ of
$D$ on which $f$ is not analytic, provided that $f$ is continuous on $T$ and $z$ is not in $T$.

The integral $\int_\gamma (w - z)^{-1}\, dw$ which appears in (6) plays an important role in
complex integration theory and is discussed further in the next section. We can
easily calculate its value for a circular path.

Example. If $\gamma$ is a positively oriented circular path with center at $z$ and radius $r$, we can
write $\gamma(\theta) = z + re^{i\theta}$, $0 \le \theta \le 2\pi$. Then $\gamma'(\theta) = ire^{i\theta} = i\{\gamma(\theta) - z\}$, and we find

$$\int_\gamma \frac{dw}{w - z} = \int_0^{2\pi} \frac{\gamma'(\theta)}{\gamma(\theta) - z}\, d\theta = \int_0^{2\pi} i\, d\theta = 2\pi i.$$

NOTE. In this case Cauchy’s integral formula (6) takes the form

$$2\pi i f(z) = \int_\gamma \frac{f(w)}{w - z}\, dw.$$

Again writing $\gamma(\theta) = z + re^{i\theta}$, we can put this in the form

$$f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(z + re^{i\theta})\, d\theta. \tag{7}$$

This can be interpreted as a Mean-Value Theorem expressing the value of $f$ at the
center of a disk as an average of its values at the boundary of the disk. The function
$f$ is assumed to be analytic on the closure of the disk, except possibly for a finite
subset on which it is continuous.
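The mean-value form (7) is easy to test numerically. The following is a minimal Python sketch, not part of the original text: the helper name `circle_mean`, the sample count, and the test functions are illustrative choices. It replaces the integral in (7) by an equally spaced Riemann sum over the circle and compares the result with the value at the center.

```python
import cmath
import math


def circle_mean(f, z, r, samples=2000):
    """Approximate (1/2pi) * integral of f(z + r e^{i theta}) over [0, 2pi]
    by an equally spaced Riemann sum (hypothetical helper for illustration)."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += f(z + r * cmath.exp(1j * theta))
    return total / samples


z0 = 0.3 + 0.2j

# For an analytic function the circle average reproduces the center value.
analytic_gap = abs(circle_mean(cmath.exp, z0, 1.5) - cmath.exp(z0))

# For a non-analytic function such as |w|^2 the average is |z0|^2 + r^2 instead.
modulus_squared_mean = circle_mean(lambda w: abs(w) ** 2, z0, 1.5)

print(analytic_gap)
print(modulus_squared_mean.real, abs(z0) ** 2)
```

The second computation is included only as a contrast: for $|w|^2$, which is not analytic, the average over the circle differs from the value at the center by $r^2$.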



16.10 THE WINDING NUMBER OF A CIRCUIT WITH RESPECT TO A POINT 


Theorem 16.16. Let $\gamma$ be a circuit and let $z$ be a point not on the graph of $\gamma$. Then
there is an integer $n$ (depending on $\gamma$ and on $z$) such that

$$\int_\gamma \frac{dw}{w - z} = 2\pi i n. \tag{8}$$

Proof. Suppose $\gamma$ has domain $[a, b]$. By Theorem 16.7 we can express the integral
in (8) as a Riemann integral,

$$\int_\gamma \frac{dw}{w - z} = \int_a^b \frac{\gamma'(t)}{\gamma(t) - z}\, dt.$$

Define a complex-valued function $F$ on the interval $[a, b]$ by the equation

$$F(x) = \int_a^x \frac{\gamma'(t)}{\gamma(t) - z}\, dt \qquad \text{if } a \le x \le b.$$

To prove the theorem we must show that $F(b) = 2\pi i n$ for some integer $n$. Now $F$
is continuous on $[a, b]$ and has a derivative

$$F'(x) = \frac{\gamma'(x)}{\gamma(x) - z}$$

at each point of continuity of $\gamma'$. Therefore the function $G$ defined by

$$G(t) = e^{-F(t)} \{\gamma(t) - z\} \qquad \text{if } t \in [a, b],$$

is also continuous on $[a, b]$. Moreover, at each point of continuity of $\gamma'$ we have

$$G'(t) = e^{-F(t)} \gamma'(t) - F'(t) e^{-F(t)} \{\gamma(t) - z\} = 0.$$

Therefore $G'(t) = 0$ for each $t$ in $[a, b]$ except (possibly) for a finite number of
points. By continuity, $G$ is constant throughout $[a, b]$. Hence, $G(b) = G(a)$. Since
$F(a) = 0$, this means

$$e^{-F(b)} \{\gamma(b) - z\} = \gamma(a) - z.$$

Since $\gamma(b) = \gamma(a) \neq z$ we find

$$e^{-F(b)} = 1,$$

which implies $F(b) = 2\pi i n$, where $n$ is an integer. This completes the proof.


Definition 16.17. If $\gamma$ is a circuit whose graph does not contain $z$, then the integer $n$
defined by (8) is called the winding number (or index) of $\gamma$ with respect to $z$, and is
denoted by $n(\gamma, z)$. Thus,

$$n(\gamma, z) = \frac{1}{2\pi i} \int_\gamma \frac{dw}{w - z}.$$

NOTE. Cauchy’s integral formula (6) can now be restated in the form

$$n(\gamma, z) f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z}\, dw.$$

The term “winding number” is used because $n(\gamma, z)$ gives a mathematically
precise way of counting the number of times the point $\gamma(t)$ “winds around” the
point $z$ as $t$ varies over the interval $[a, b]$. For example, if $\gamma$ is a positively oriented
circle given by $\gamma(\theta) = z + re^{i\theta}$, where $0 \le \theta \le 2\pi$, we have already seen that the
winding number is 1. This is in accord with the physical interpretation of the
point $\gamma(\theta)$ moving once around a circle in the positive direction as $\theta$ varies from
$0$ to $2\pi$. If $\theta$ varies over the interval $[0, 2\pi n]$, the point $\gamma(\theta)$ moves $n$ times around
the circle in the positive direction and an easy calculation shows that the winding
number is $n$. On the other hand, if $\delta(\theta) = z + re^{-i\theta}$ for $0 \le \theta \le 2\pi n$, then $\delta(\theta)$
moves $n$ times around the circle in the opposite direction and the winding number
is $-n$. Such a path $\delta$ is said to be negatively oriented.
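The circle examples above can be checked numerically. The sketch below is an illustration only, assuming a closed path given as a Python function on a parameter interval; the name `winding_number` and the step count are choices made here, not notation from the text. It approximates $(1/2\pi i)\int_\gamma dw/(w - z)$ by summing, over many short chords of the path, the principal logarithm of the ratio of consecutive values of $\gamma(t) - z$.

```python
import cmath
import math


def winding_number(path, a, b, z, steps=20000):
    """Approximate (1/(2 pi i)) * integral of dw/(w - z) along a closed path
    defined on [a, b]. Each short chord w0 -> w1 contributes the principal
    value of log((w1 - z)/(w0 - z)), whose imaginary part is the change of
    argument along that chord (hypothetical helper for illustration)."""
    total = 0j
    w_prev = path(a)
    for k in range(1, steps + 1):
        w_next = path(a + (b - a) * k / steps)
        total += cmath.log((w_next - z) / (w_prev - z))
        w_prev = w_next
    return total / (2j * math.pi)


z = 1 + 1j
once = winding_number(lambda t: z + 2 * cmath.exp(1j * t), 0.0, 2 * math.pi, z)
three_times = winding_number(lambda t: z + 2 * cmath.exp(1j * t), 0.0, 6 * math.pi, z)
twice_negative = winding_number(lambda t: z + 2 * cmath.exp(-1j * t), 0.0, 4 * math.pi, z)
outside = winding_number(lambda t: cmath.exp(1j * t), 0.0, 2 * math.pi, 3 + 0j)

print(round(once.real), round(three_times.real), round(twice_negative.real), round(outside.real))
```

The last call uses a point outside the unit circle, for which the accumulated change of argument cancels and the computed value is close to zero.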


16.11 THE UNBOUNDEDNESS OF THE SET OF POINTS WITH WINDING 
NUMBER ZERO 

Let $\Gamma$ denote the graph of a circuit $\gamma$. Since $\Gamma$ is a compact set, its complement
$\mathbf{C} - \Gamma$ is an open set which, by Theorem 4.44, is a countable union of disjoint
open regions (the components of $\mathbf{C} - \Gamma$). If we consider the components as
subsets of the extended plane $\mathbf{C}^*$, exactly one of these contains the ideal point $\infty$.
In other words, one and only one of the components of $\mathbf{C} - \Gamma$ is unbounded.
The next theorem shows that the winding number $n(\gamma, z)$ is 0 for each $z$ in the
unbounded component.


Theorem 16.18. Let $\gamma$ be a circuit with graph $\Gamma$. Divide the set $\mathbf{C} - \Gamma$ into two
subsets:

$$E = \{z : n(\gamma, z) = 0\} \qquad \text{and} \qquad I = \{z : n(\gamma, z) \neq 0\}.$$

Then both $E$ and $I$ are open. Moreover, $E$ is unbounded and $I$ is bounded.


Proof. Define a function $g$ on $\mathbf{C} - \Gamma$ by the formula

$$g(z) = n(\gamma, z) = \frac{1}{2\pi i} \int_\gamma \frac{dw}{w - z}.$$

By Theorem 7.38, $g$ is continuous on $\mathbf{C} - \Gamma$ and, since $g(z)$ is always an integer,
it follows that $g$ is constant on each component of $\mathbf{C} - \Gamma$. Therefore both $E$ and
$I$ are open since each is a union of components of $\mathbf{C} - \Gamma$.

Let $U$ denote the unbounded component of $\mathbf{C} - \Gamma$. If we prove that $E$ contains
$U$ this will show that $E$ is unbounded and that $I$ is bounded. Let $K$ be a
constant such that $|\gamma(t)| < K$ for all $t$ in the domain of $\gamma$, and let $c$ be a point in
$U$ such that $|c| > K + \Lambda(\gamma)$, where $\Lambda(\gamma)$ is the length of $\gamma$. Then we have

$$\left| \frac{1}{\gamma(t) - c} \right| \le \frac{1}{|c| - |\gamma(t)|} \le \frac{1}{|c| - K}.$$

Estimating the integral for $n(\gamma, c)$ by Theorem 16.6 we find

$$0 \le |g(c)| \le \frac{\Lambda(\gamma)}{|c| - K} < 1.$$

Since $g(c)$ is an integer we must have $g(c) = 0$, so $g$ has the constant value 0 on $U$.
Hence $E$ contains the point $c$, so $E$ contains all of $U$.

There is a general theorem, called the Jordan curve theorem, which states that
if $\Gamma$ is a Jordan curve (simple closed curve) described by $\gamma$, then each of the sets
$E$ and $I$ in Theorem 16.18 is connected. In other words, a Jordan curve $\Gamma$ divides
$\mathbf{C} - \Gamma$ into exactly two components $E$ and $I$ having $\Gamma$ as their common boundary.
The set $I$ is called the inner (or interior) region of $\Gamma$, and its points are said to be
inside $\Gamma$. The set $E$ is called the outer (or exterior) region of $\Gamma$, and its points are
said to be outside $\Gamma$.

Although the Jordan curve theorem is intuitively evident and easy to prove for
certain familiar Jordan curves such as circles, triangles, and rectangles, the proof
for an arbitrary Jordan curve is by no means simple. (Proofs can be found in
References 16.3 and 16.5.)

We shall not need the Jordan curve theorem to prove any of the theorems in
this chapter. However, the reader should realize that the Jordan curves occurring
in the ordinary applications of complex integration theory are usually made up of
a finite number of line segments and circular arcs, and for such examples it is
usually quite obvious that $\mathbf{C} - \Gamma$ consists of exactly two components. For points
$z$ inside such curves the winding number $n(\gamma, z)$ is $+1$ or $-1$ because $\gamma$ is homotopic
in $I$ to some circular path $\delta$ with center $z$, so $n(\gamma, z) = n(\delta, z)$, and $n(\delta, z)$ is
$+1$ or $-1$ depending on whether the circular path $\delta$ is positively or negatively
oriented. For this reason we say that a Jordan circuit $\gamma$ is positively oriented if,
for some $z$ inside $\Gamma$, we have $n(\gamma, z) = +1$, and negatively oriented if $n(\gamma, z) = -1$.

16.12 ANALYTIC FUNCTIONS DEFINED BY CONTOUR INTEGRALS 
Cauchy’s integral formula, which states that

$$n(\gamma, z) f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z}\, dw,$$

has many important consequences. Some of these follow from the next theorem
which treats integrals of a slightly more general type in which the integrand
$f(w)/(w - z)$ is replaced by $\varphi(w)/(w - z)$, where $\varphi$ is merely continuous and not
necessarily analytic, and $\gamma$ is any rectifiable path, not necessarily a circuit.

Theorem 16.19. Let $\gamma$ be a rectifiable path with graph $\Gamma$. Let $\varphi$ be a complex-valued
function which is continuous on $\Gamma$, and let $f$ be defined on $\mathbf{C} - \Gamma$ by the equation

$$f(z) = \int_\gamma \frac{\varphi(w)}{w - z}\, dw \qquad \text{if } z \notin \Gamma.$$

Then $f$ has the following properties:

a) For each point $a$ in $\mathbf{C} - \Gamma$, $f$ has a power-series representation

$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n, \tag{9}$$

where

$$c_n = \int_\gamma \frac{\varphi(w)}{(w - a)^{n+1}}\, dw \qquad \text{for } n = 0, 1, 2, \ldots \tag{10}$$

b) The series in (a) has a positive radius of convergence $\ge R$, where

$$R = \inf \{|w - a| : w \in \Gamma\}. \tag{11}$$

c) The function $f$ has a derivative of every order $n$ on $\mathbf{C} - \Gamma$ given by

$$f^{(n)}(z) = n! \int_\gamma \frac{\varphi(w)}{(w - z)^{n+1}}\, dw \qquad \text{if } z \notin \Gamma. \tag{12}$$

Proof. First we note that the number $R$ defined by (11) is positive because the
function $g(w) = |w - a|$ has a minimum on the compact set $\Gamma$, and this minimum
is not zero since $a \notin \Gamma$. Thus, $R$ is the distance from $a$ to the nearest point of $\Gamma$.
(See Fig. 16.6.)

To prove (a) we begin with the identity

$$\frac{1}{1 - t} = \sum_{n=0}^{k} t^n + \frac{t^{k+1}}{1 - t}, \tag{13}$$

valid for all $t \neq 1$. We take $t = (z - a)/(w - a)$, where $|z - a| < R$ and $w \in \Gamma$.
Then $1/(1 - t) = (w - a)/(w - z)$. Multiplying (13) by $\varphi(w)/(w - a)$ and
integrating along $\gamma$, we find

$$f(z) = \int_\gamma \frac{\varphi(w)}{w - z}\, dw = \sum_{n=0}^{k} c_n (z - a)^n + E_k,$$

where $c_n$ is given by (10) and $E_k$ is given by

$$E_k = \int_\gamma \frac{\varphi(w)}{w - z} \left( \frac{z - a}{w - a} \right)^{k+1} dw. \tag{14}$$

Now we show that $E_k \to 0$ as $k \to \infty$ by estimating the integrand in (14). We have

$$\left| \frac{z - a}{w - a} \right| \le \frac{|z - a|}{R}
\qquad \text{and} \qquad
\frac{1}{|w - z|} = \frac{1}{|w - a + a - z|} \le \frac{1}{R - |a - z|}.$$

Let $M = \max \{|\varphi(w)| : w \in \Gamma\}$, and let $\Lambda(\gamma)$ denote the length of $\gamma$. Then (14)
gives us

$$|E_k| \le \frac{M \Lambda(\gamma)}{R - |a - z|} \left( \frac{|z - a|}{R} \right)^{k+1}.$$

Since $|z - a| < R$ we find that $E_k \to 0$ as $k \to \infty$. This proves (a) and (b).

Applying Theorem 9.23 to (9) we find that $f$ has derivatives of every order on
the disk $B(a; R)$ and that $f^{(n)}(a) = n!\, c_n$. Since $a$ is an arbitrary point of $\mathbf{C} - \Gamma$
this proves (c).


NOTE. The series in (9) may have a radius of convergence greater than $R$, in which
case it may or may not represent $f$ at more distant points.


16.13 POWER-SERIES EXPANSIONS FOR ANALYTIC FUNCTIONS 


A combination of Cauchy’s integral formula with Theorem 16.19 gives us: 


Theorem 16.20. Assume $f$ is analytic on an open set $S$ in $\mathbf{C}$, and let $a$ be any point
of $S$. Then all derivatives $f^{(n)}(a)$ exist, and $f$ can be represented by the convergent
power series

$$f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (z - a)^n, \tag{15}$$

in every disk $B(a; R)$ whose closure lies in $S$. Moreover, for every $n \ge 0$ we have

$$f^{(n)}(a) = \frac{n!}{2\pi i} \int_\gamma \frac{f(w)}{(w - a)^{n+1}}\, dw, \tag{16}$$

where $\gamma$ is any positively oriented circular path with center at $a$ and radius $r < R$.


NOTE. The series in (15) is known as the Taylor expansion of $f$ about $a$. Equation
(16) is called Cauchy’s integral formula for $f^{(n)}(a)$.


Proof. Let $\gamma$ be a circuit homotopic to a point in $S$, and let $\Gamma$ be the graph of
$\gamma$. Define $g$ on $\mathbf{C} - \Gamma$ by the equation

$$g(z) = \int_\gamma \frac{f(w)}{w - z}\, dw \qquad \text{if } z \notin \Gamma.$$

If $z \in B(a; R)$, Cauchy’s integral formula tells us that $g(z) = 2\pi i\, n(\gamma, z) f(z)$.
Hence,

$$n(\gamma, z) f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z}\, dw \qquad \text{if } |z - a| < R.$$

Now let $\gamma(\theta) = a + re^{i\theta}$, where $|z - a| < r < R$ and $0 \le \theta \le 2\pi$. Then
$n(\gamma, z) = 1$, so by applying Theorem 16.19 to $\varphi(w) = f(w)/(2\pi i)$ we find a series
representation

$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n,$$

convergent for $|z - a| < R$, where $c_n = f^{(n)}(a)/n!$. Also, part (c) of Theorem
16.19 gives (16).

Theorems 16.20 and 9.23 together tell us that a necessary and sufficient condition
for a complex-valued function $f$ to be analytic at a point $a$ is that $f$ be
representable by a power series in some neighborhood of $a$. When such a power
series exists, its radius of convergence is at least as large as the radius of any
disk $B(a)$ which lies in the region of analyticity of $f$. Since the circle of convergence
cannot contain any points in its interior where $f$ fails to be analytic, it follows that
the radius of convergence is exactly equal to the distance from $a$ to the nearest
point at which $f$ fails to be analytic.

This observation gives us a deeper insight concerning power-series expansions
for real-valued functions of a real variable. For example, let $f(x) = 1/(1 + x^2)$ if
$x$ is real. This function is defined everywhere in $\mathbf{R}^1$ and has derivatives of every
order at each point in $\mathbf{R}^1$. Also, it has a power-series expansion about the origin,
namely,

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots.$$

However, this representation is valid only in the open interval $(-1, 1)$. From the
standpoint of real-variable theory, there is nothing in the behavior of $f$ which
explains this. But when we examine the situation in the complex plane, we see at
once that the function $f(z) = 1/(1 + z^2)$ is analytic everywhere in $\mathbf{C}$ except at
the points $z = \pm i$. Therefore the radius of convergence of the power-series
expansion about 0 must equal 1, the distance from 0 to $i$ and to $-i$.
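The behavior of this series on either side of the radius of convergence can be seen with a few lines of Python. This is only an illustrative sketch; the helper name `partial_sum` and the sample points 0.5 and 1.5 are choices made here. It evaluates partial sums of $1 - x^2 + x^4 - \cdots$ at a real point inside the interval $(-1, 1)$ and at one outside it.

```python
def partial_sum(x, terms):
    """Sum of the first `terms` terms of 1 - x^2 + x^4 - x^6 + ...
    (hypothetical helper for illustration)."""
    return sum((-1) ** n * x ** (2 * n) for n in range(terms))


inside = partial_sum(0.5, 60)    # |x| < 1: partial sums approach 1/(1 + x^2)
outside = partial_sum(1.5, 60)   # |x| > 1: partial sums grow without bound

print(inside, 1 / (1 + 0.5 ** 2))
print(outside, 1 / (1 + 1.5 ** 2))
```

Although $1/(1 + x^2)$ is perfectly well behaved at $x = 1.5$, the partial sums there are enormous, in line with the radius of convergence being limited by the singularities at $\pm i$.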


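Examples. The following power series expansions are valid for all $z$ in $\mathbf{C}$:

a) $\displaystyle e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$,

b) $\displaystyle \sin z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n + 1)!}$,

c) $\displaystyle \cos z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n}}{(2n)!}$.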


16.14 CAUCHY’S INEQUALITIES. LIOUVILLE’S THEOREM 


If $f$ is analytic on a closed disk $\overline{B}(a; R)$, Cauchy’s integral formula (16) shows that

$$f^{(n)}(a) = \frac{n!}{2\pi i} \int_\gamma \frac{f(w)}{(w - a)^{n+1}}\, dw,$$

where $\gamma$ is any positively oriented circular path with center $a$ and radius $r < R$.
We can write $\gamma(\theta) = a + re^{i\theta}$, $0 \le \theta \le 2\pi$, and put this in the form

$$f^{(n)}(a) = \frac{n!}{2\pi r^n} \int_0^{2\pi} f(a + re^{i\theta})\, e^{-in\theta}\, d\theta. \tag{17}$$

This formula expresses the $n$th derivative at $a$ as a weighted average of the values
of $f$ on a circle with center at $a$. The special case $n = 0$ was obtained earlier in
Section 16.9.

Now, let $M(r)$ denote the maximum value of $|f|$ on the graph of $\gamma$. Estimating
the integral in (17), we immediately obtain Cauchy’s inequalities:

$$|f^{(n)}(a)| \le \frac{M(r)\, n!}{r^n} \qquad (n = 0, 1, 2, \ldots).$$
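Formula (17) also gives a practical way to approximate derivatives from values on a circle. The sketch below is illustrative only; the function name `derivative_by_circle_average`, the radius, and the sample count are assumptions made here. It replaces the integral in (17) by an equally spaced sum and checks the result against functions whose derivatives are known.

```python
import cmath
import math


def derivative_by_circle_average(f, a, n, r=1.0, samples=4096):
    """Approximate f^(n)(a) via formula (17):
    n!/(2 pi r^n) * integral of f(a + r e^{i theta}) e^{-i n theta} over [0, 2pi],
    using an equally spaced Riemann sum (hypothetical helper for illustration)."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += f(a + r * cmath.exp(1j * theta)) * cmath.exp(-1j * n * theta)
    return math.factorial(n) * total / (samples * r ** n)


a = 0.5j
third_derivative_of_exp = derivative_by_circle_average(cmath.exp, a, 3)
first_derivative_of_sin = derivative_by_circle_average(cmath.sin, 0.0, 1)

# Cauchy's inequality for exp at 0 with r = 1 and n = 5: M(1) = e on the unit circle.
fifth = derivative_by_circle_average(cmath.exp, 0.0, 5)
cauchy_bound = math.e * math.factorial(5) / 1.0 ** 5

print(abs(third_derivative_of_exp - cmath.exp(a)))
print(abs(first_derivative_of_sin - 1.0))
print(abs(fifth), cauchy_bound)
```

In the last line the computed value of $|f^{(5)}(0)|$ for the exponential is close to 1, well below the bound $M(r)\,n!/r^n = 120e$ given by Cauchy’s inequalities.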



The next theorem is an easy consequence of the case n = 1 . 

Theorem 16.21 (Liouville’s theorem). If $f$ is analytic everywhere on $\mathbf{C}$ and bounded
on $\mathbf{C}$, then $f$ is constant.


Proof. Suppose $|f(z)| \le M$ for all $z$ in $\mathbf{C}$. Then Cauchy’s inequality with $n = 1$
gives us $|f'(a)| \le M/r$ for every $r > 0$. Letting $r \to +\infty$, we find $f'(a) = 0$ for
every $a$ in $\mathbf{C}$ and hence, by Theorem 5.23, $f$ is constant.

NOTE. A function analytic everywhere on $\mathbf{C}$ is called an entire function. Examples
are polynomials, the sine and cosine, and the exponential. Liouville’s theorem
states that every bounded entire function is constant.

Liouville’s theorem leads to a simple proof of the Fundamental Theorem of 
Algebra. 


Theorem 16.22 (Fundamental Theorem of Algebra). Every polynomial of degree
$n \ge 1$ has a zero.


Proof. Let $P(z) = a_0 + a_1 z + \cdots + a_n z^n$, where $n \ge 1$ and $a_n \neq 0$. We assume
that $P$ has no zero and prove that $P$ is constant. Let $f(z) = 1/P(z)$. Then $f$ is
analytic everywhere on $\mathbf{C}$ since $P$ is never zero. Also, since

$$P(z) = z^n \left( \frac{a_0}{z^n} + \frac{a_1}{z^{n-1}} + \cdots + \frac{a_{n-1}}{z} + a_n \right),$$

we see that $|P(z)| \to +\infty$ as $|z| \to +\infty$, so $f(z) \to 0$ as $|z| \to +\infty$. Therefore
$f$ is bounded on $\mathbf{C}$ so, by Liouville’s theorem, $f$ and hence $P$ is constant.


16.15 ISOLATION OF THE ZEROS OF AN ANALYTIC FUNCTION 

If $f$ is analytic at $a$ and if $f(a) = 0$, the Taylor expansion of $f$ about $a$ has constant
term zero and hence assumes the following form:

$$f(z) = \sum_{n=1}^{\infty} c_n (z - a)^n.$$

This is valid for each $z$ in some disk $B(a)$. If $f$ is identically zero on this disk [that
is, if $f(z) = 0$ for every $z$ in $B(a)$], then each $c_n = 0$, since $c_n = f^{(n)}(a)/n!$. If $f$ is
not identically zero on this neighborhood, there will be a first nonzero coefficient
$c_k$ in the expansion, in which case the point $a$ is said to be a zero of order $k$. We
will prove next that there is a neighborhood of $a$ which contains no further zeros
of $f$. This property is described by saying that the zeros of an analytic function are
isolated.

Theorem 16.23. Assume that $f$ is analytic on an open set $S$ in $\mathbf{C}$. Suppose $f(a) = 0$
for some point $a$ in $S$ and assume that $f$ is not identically zero on any neighborhood
of $a$. Then there exists a disk $B(a)$ in which $f$ has no further zeros.

Proof. The Taylor expansion about $a$ becomes $f(z) = (z - a)^k g(z)$, where $k \ge 1$,

$$g(z) = c_k + c_{k+1}(z - a) + \cdots, \qquad \text{and} \qquad g(a) = c_k \neq 0.$$

Since $g$ is continuous at $a$, there is a disk $B(a) \subseteq S$ on which $g$ does not vanish.
Therefore, $f(z) \neq 0$ for all $z \neq a$ in $B(a)$.

This theorem has several important consequences. For example, we can use 
it to show that a function which is analytic on an open region S cannot be zero 
on any nonempty open subset of S without being identically zero throughout S. 
We recall that an open region is an open connected set. (See Definitions 4.34 
and 4.45.) 

Theorem 16.24. Assume that $f$ is analytic on an open region $S$ in $\mathbf{C}$. Let $A$ denote the
set of those points $z$ in $S$ for which there exists a disk $B(z)$ on which $f$ is identically
zero, and let $B = S - A$. Then one of the two sets $A$ or $B$ is empty and the other
one is $S$ itself.

Proof. We have $S = A \cup B$, where $A$ and $B$ are disjoint sets. The set $A$ is open
by its very definition. If we prove that $B$ is also open, it will follow from the
connectedness of $S$ that at least one of the two sets $A$ or $B$ is empty.

To prove $B$ is open, let $a$ be a point of $B$ and consider the two possibilities:
$f(a) \neq 0$, $f(a) = 0$. If $f(a) \neq 0$, there is a disk $B(a) \subseteq S$ on which $f$ does not
vanish. Each point of this disk must therefore belong to $B$. Hence, $a$ is an interior
point of $B$ if $f(a) \neq 0$. But, if $f(a) = 0$, Theorem 16.23 provides us with a disk
$B(a)$ containing no further zeros of $f$. This means that $B(a) \subseteq B$. Hence, in either
case, $a$ is an interior point of $B$. Therefore, $B$ is open and one of the two sets $A$ or
$B$ must be empty.

16.16 THE IDENTITY THEOREM FOR ANALYTIC FUNCTIONS 

Theorem 16.25. Assume that $f$ is analytic on an open region $S$ in $\mathbf{C}$. Let $T$ be a
subset of $S$ having an accumulation point $a$ in $S$. If $f(z) = 0$ for every $z$ in $T$, then
$f(z) = 0$ for every $z$ in $S$.

Proof. There exists an infinite sequence $\{z_n\}$, whose terms are points of $T$, such
that $\lim_{n \to \infty} z_n = a$. By continuity, $f(a) = \lim_{n \to \infty} f(z_n) = 0$. We will prove
next that there is a neighborhood of $a$ on which $f$ is identically zero. Suppose
there is no such neighborhood. Then Theorem 16.23 tells us that there must be a
disk $B(a)$ on which $f(z) \neq 0$ if $z \neq a$. But this is impossible, since every disk $B(a)$
contains points of $T$ other than $a$. Therefore there must be a neighborhood of $a$
on which $f$ vanishes identically. Hence the set $A$ of Theorem 16.24 cannot be
empty. Therefore, $A = S$, and this means $f(z) = 0$ for every $z$ in $S$.

As a corollary we have the following important result, sometimes referred to
as the identity theorem for analytic functions:

Theorem 16.26. Let $f$ and $g$ be analytic on an open region $S$ in $\mathbf{C}$. If $T$ is a subset
of $S$ having an accumulation point $a$ in $S$, and if $f(z) = g(z)$ for every $z$ in $T$, then
$f(z) = g(z)$ for every $z$ in $S$.

Proof. Apply Theorem 16.25 to $f - g$.

16.17 THE MAXIMUM AND MINIMUM MODULUS OF AN ANALYTIC 
FUNCTION 

The absolute value or modulus $|f|$ of an analytic function $f$ is a real-valued
nonnegative function. The theorems of this section refer to maxima and minima of
$|f|$.

Theorem 16.27 (Local maximum modulus principle). Assume $f$ is analytic and not
constant on an open region $S$. Then $|f|$ has no local maxima in $S$. That is, every
disk $B(a; R)$ in $S$ contains points $z$ such that $|f(z)| > |f(a)|$.

Proof. We assume there is a disk $B(a; R)$ in $S$ in which $|f(z)| \le |f(a)|$ and prove
that $f$ is constant on $S$. Consider the concentric disk $B(a; r)$ with $0 < r < R$.
From Cauchy’s integral formula, as expressed in (7), we have

$$|f(a)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|\, d\theta. \tag{19}$$

Now $|f(a + re^{i\theta})| \le |f(a)|$ for all $\theta$. We show next that we cannot have strict
inequality $|f(a + re^{i\theta})| < |f(a)|$ for any $\theta$. Otherwise, by continuity we would
have $|f(a + re^{i\theta})| \le |f(a)| - \varepsilon$ for some $\varepsilon > 0$ and all $\theta$ in some subinterval $I$ of
$[0, 2\pi]$ of positive length $h$, say. Let $J = [0, 2\pi] - I$. Then $J$ has measure
$2\pi - h$, and (19) gives us

$$2\pi |f(a)| \le \int_I |f(a + re^{i\theta})|\, d\theta + \int_J |f(a + re^{i\theta})|\, d\theta
\le h\{|f(a)| - \varepsilon\} + (2\pi - h)|f(a)| = 2\pi |f(a)| - h\varepsilon < 2\pi |f(a)|.$$

Thus we get the contradiction $|f(a)| < |f(a)|$. This shows that if $r < R$, we
cannot have strict inequality $|f(a + re^{i\theta})| < |f(a)|$ for any $\theta$. Hence $|f(z)| = |f(a)|$
for every $z$ in $B(a; R)$. Therefore $|f|$ is constant on this disk so, by Theorem 5.23,
$f$ itself is constant on this disk. By the identity theorem, $f$ is constant on $S$.

Theorem 16.28 (Absolute maximum modulus principle). Let $T$ be a compact subset
of the complex plane $\mathbf{C}$. Assume $f$ is continuous on $T$ and analytic on the interior of
$T$. Then the absolute maximum of $|f|$ on $T$ is attained on $\partial T$, the boundary of $T$.

Proof. Since $T$ is compact, $|f|$ attains its absolute maximum somewhere on $T$,
say at $a$. If $a \in \partial T$ there is nothing to prove. If $a \in \operatorname{int} T$, let $S$ be the component
of $\operatorname{int} T$ containing $a$. Since $|f|$ has a local maximum at $a$, Theorem 16.27 implies
that $f$ is constant on $S$. By continuity, $f$ is constant on $\partial S \subseteq T$, so the maximum
value, $|f(a)|$, is attained on $\partial S$. But $\partial S \subseteq \partial T$ (Why?) so the maximum is attained
on $\partial T$.

Theorem 16.29 (Minimum modulus principle). Assume $f$ is analytic and not constant
on an open region $S$. If $|f|$ has a local minimum in $S$ at $a$, then $f(a) = 0$.

Proof. If $f(a) \neq 0$ we apply Theorem 16.27 to $g = 1/f$. Then $g$ is analytic in some
open disk $B(a; R)$ and $|g|$ has a local maximum at $a$. Therefore $g$ and hence $f$ is
constant on this disk and therefore on $S$, contradicting the hypothesis.

16.18 THE OPEN MAPPING THEOREM 

Nonconstant analytic functions are open mappings; that is, they map open sets 
onto open sets. We prove this as an application of the minimum modulus 
principle. 

Theorem 16.30 (Open mapping theorem). If $f$ is analytic and not constant on an
open region $S$, then $f$ is open.

Proof. Let $A$ be any open subset of $S$. We are to prove that $f(A)$ is open. Take
any $b$ in $f(A)$ and write $b = f(a)$, where $a \in A$. First we note that $a$ is an isolated
point of the inverse-image $f^{-1}(\{b\})$. (If not, by the identity theorem $f$ would be
constant on $S$.) Hence there is some disk $B = B(a; r)$ whose closure $\overline{B}$ lies in $A$
and contains no point of $f^{-1}(\{b\})$ except $a$. Since $f(B) \subseteq f(A)$ the proof will be
complete if we show that $f(B)$ contains a disk with center at $b$.

Let $\partial B$ denote the boundary of $B$, $\partial B = \{z : |z - a| = r\}$. Then $f(\partial B)$ is a
compact set which does not contain $b$. Hence the number $m$ defined by

$$m = \inf \{|f(z) - b| : z \in \partial B\},$$

is positive. We will show that $f(B)$ contains the disk $B(b; m/2)$. To do this, we
take any $w$ in $B(b; m/2)$ and show that $w = f(z_0)$ for some $z_0$ in $B$.

Let $g(z) = f(z) - w$ if $z \in \overline{B}$. We will prove that $g(z_0) = 0$ for some $z_0$ in $B$.
Now $|g|$ is continuous on $\overline{B}$ and, since $\overline{B}$ is compact, there is a point $z_0$ in $\overline{B}$ at
which $|g|$ attains its minimum. Since $a \in \overline{B}$, this implies

$$|g(z_0)| \le |g(a)| = |f(a) - w| = |b - w| < \frac{m}{2}.$$

But if $z \in \partial B$, we have

$$|g(z)| = |f(z) - b + b - w| \ge |f(z) - b| - |w - b| > m - \frac{m}{2} = \frac{m}{2}.$$

Hence, $z_0 \notin \partial B$ so $z_0$ is an interior point of $\overline{B}$. In other words, $|g|$ has a local
minimum at $z_0$. Since $g$ is analytic and not constant on $B$, the minimum modulus
principle shows that $g(z_0) = 0$ and the proof is complete.


16.19 LAURENT EXPANSIONS FOR FUNCTIONS ANALYTIC IN AN 
ANNULUS 


Consider two functions $f_1$ and $g_1$, both analytic at a point $a$, with $g_1(a) = 0$. Then
we have power-series expansions

$$g_1(z) = \sum_{n=1}^{\infty} b_n (z - a)^n, \qquad \text{for } |z - a| < 1/r_1,$$

and

$$f_1(z) = \sum_{n=0}^{\infty} c_n (z - a)^n, \qquad \text{for } |z - a| < r_2. \tag{20}$$

Let $f_2$ denote the composite function given by

$$f_2(z) = g_1\!\left( \frac{1}{z - a} + a \right).$$

Then $f_2$ is defined and analytic in the region $|z - a| > r_1$ and is represented there
by the convergent series

$$f_2(z) = \sum_{n=1}^{\infty} b_n (z - a)^{-n}, \qquad \text{for } |z - a| > r_1. \tag{21}$$



Now if $r_1 < r_2$, the series in (20) and (21) will have a region of convergence in
common, namely the set of $z$ for which

$$r_1 < |z - a| < r_2.$$

In this region, the interior of the annulus $A(a; r_1, r_2)$, both $f_1$ and $f_2$ are analytic
and their sum $f_1 + f_2$ is given by

$$f_1(z) + f_2(z) = \sum_{n=0}^{\infty} c_n (z - a)^n + \sum_{n=1}^{\infty} b_n (z - a)^{-n}.$$

The sum on the right is written more briefly as

$$\sum_{n=-\infty}^{\infty} c_n (z - a)^n,$$

where $c_{-n} = b_n$ for $n = 1, 2, \ldots$ A series of this type, consisting of both positive
and negative powers of $z - a$, is called a Laurent series. We say it converges if
both parts converge separately.

Every convergent Laurent series represents an analytic function in the interior
of the annulus $A(a; r_1, r_2)$. Now we will prove that, conversely, every function
$f$ which is analytic on an annulus can be represented in the interior of the annulus
by a convergent Laurent series.

Theorem 16.31. Assume that $f$ is analytic on an annulus $A(a; r_1, r_2)$. Then for
every interior point $z$ of this annulus we have

$$f(z) = f_1(z) + f_2(z), \tag{22}$$

where

$$f_1(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \qquad \text{and} \qquad f_2(z) = \sum_{n=1}^{\infty} c_{-n} (z - a)^{-n}.$$

The coefficients are given by the formulas

$$c_n = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{(w - a)^{n+1}}\, dw \qquad (n = 0, \pm 1, \pm 2, \ldots), \tag{23}$$

where $\gamma$ is any positively oriented circular path with center at $a$ and radius $r$, with
$r_1 < r < r_2$. The function $f_1$ (called the regular part of $f$ at $a$) is analytic on the
disk $B(a; r_2)$. The function $f_2$ (called the principal part of $f$ at $a$) is analytic outside
the closure of the disk $B(a; r_1)$.

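Proof. Choose an interior point $z$ of the annulus, keep $z$ fixed, and define a function
$g$ on $A(a; r_1, r_2)$ as follows:

$$g(w) = \begin{cases} \dfrac{f(w) - f(z)}{w - z} & \text{if } w \neq z, \\[2mm] f'(z) & \text{if } w = z. \end{cases}$$

Then $g$ is analytic at $w$ if $w \neq z$ and $g$ is continuous at $z$. Let

$$\varphi(r) = \int_{\gamma_r} g(w)\, dw,$$

where $\gamma_r$ is a positively oriented circular path with center $a$ and radius $r$, with
$r_1 \le r \le r_2$. By Theorem 16.8, $\varphi(r_1) = \varphi(r_2)$ so

$$\int_{\gamma_1} g(w)\, dw = \int_{\gamma_2} g(w)\, dw, \tag{24}$$

where $\gamma_1 = \gamma_{r_1}$ and $\gamma_2 = \gamma_{r_2}$. Since $z$ is not on the graph of $\gamma_1$ or of $\gamma_2$, in each of
these integrals we can write

$$g(w) = \frac{f(w)}{w - z} - \frac{f(z)}{w - z}.$$

Substituting this in (24) and transposing terms, we find

$$f(z) \left\{ \int_{\gamma_2} \frac{1}{w - z}\, dw - \int_{\gamma_1} \frac{1}{w - z}\, dw \right\}
= \int_{\gamma_2} \frac{f(w)}{w - z}\, dw - \int_{\gamma_1} \frac{f(w)}{w - z}\, dw. \tag{25}$$

But $\int_{\gamma_1} (w - z)^{-1}\, dw = 0$ since the integrand is analytic on the disk $B(a; r_1)$,
and $\int_{\gamma_2} (w - z)^{-1}\, dw = 2\pi i$ since $n(\gamma_2, z) = 1$. Therefore, (25) gives us the
equation

$$f(z) = f_1(z) + f_2(z),$$

where

$$f_1(z) = \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{w - z}\, dw
\qquad \text{and} \qquad
f_2(z) = -\frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{w - z}\, dw.$$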

By Theorem 16.19, $f_1$ is analytic on the disk $B(a; r_2)$ and hence we have a Taylor
expansion

$$f_1(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \qquad \text{for } |z - a| < r_2,$$

where

$$c_n = \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{(w - a)^{n+1}}\, dw. \tag{26}$$

Moreover, by Theorem 16.8, the path $\gamma_2$ can be replaced by $\gamma_r$ for any $r$ in the
interval $r_1 \le r \le r_2$.

To find a series expansion for $f_2(z)$, we argue as in the proof of Theorem 16.19,
using the identity (13) with $t = (w - a)/(z - a)$. This gives us

$$\frac{1}{1 - (w - a)/(z - a)} = \sum_{n=0}^{k} \left( \frac{w - a}{z - a} \right)^n
+ \left( \frac{w - a}{z - a} \right)^{k+1} \left( \frac{z - a}{z - w} \right). \tag{27}$$

If $w$ is on the graph of $\gamma_1$, we have $|w - a| = r_1 < |z - a|$, so $|t| < 1$. Now we
multiply (27) by $-f(w)/(z - a)$, integrate along $\gamma_1$, and let $k \to \infty$ to obtain

$$f_2(z) = \sum_{n=1}^{\infty} b_n (z - a)^{-n} \qquad \text{for } |z - a| > r_1,$$

where

$$b_n = \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{(w - a)^{1-n}}\, dw. \tag{28}$$

By Theorem 16.8, the path $\gamma_1$ can be replaced by $\gamma_r$ for any $r$ in $[r_1, r_2]$. If we take
the same path $\gamma_r$ in both (28) and (26) and if we write $c_{-n}$ for $b_n$, both formulas
can be combined into one as indicated in (23). Since $z$ was an arbitrary interior
point of the annulus, this completes the proof.


NOTE. Formula (23) shows that a function can have at most one Laurent expansion
in a given annulus.


16.20 ISOLATED SINGULARITIES 

A disk $B(a; r)$ minus its center, that is, the set $B(a; r) - \{a\}$, is called a deleted
neighborhood of $a$ and is denoted by $B'(a; r)$ or $B'(a)$.

Definition 16.32. A point $a$ is called an isolated singularity of $f$ if

a) $f$ is analytic on a deleted neighborhood of $a$,
and

b) $f$ is not analytic at $a$.

NOTE. $f$ need not be defined at $a$.

If $a$ is an isolated singularity of $f$, there is an annulus $A(a; r_1, r_2)$ on which $f$ is
analytic. Hence $f$ has a uniquely determined Laurent expansion, say

$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n + \sum_{n=1}^{\infty} c_{-n} (z - a)^{-n}. \tag{29}$$

Since the inner radius $r_1$ can be arbitrarily small, (29) is valid in the deleted
neighborhood $B'(a; r_2)$. The singularity $a$ is classified into one of three types
(depending on the form of the principal part) as follows:

If no negative powers appear in (29), that is, if $c_{-n} = 0$ for every $n = 1, 2, \ldots$,
the point $a$ is called a removable singularity. In this case, $f(z) \to c_0$ as $z \to a$ and
the singularity can be removed by defining $f$ at $a$ to have the value $f(a) = c_0$.
(See Example 1 below.)

If only a finite number of negative powers appear, that is, if $c_{-n} \neq 0$ for some
$n$ but $c_{-m} = 0$ for every $m > n$, the point $a$ is said to be a pole of order $n$. In this
case, the principal part is simply a finite sum, namely,

$$\frac{c_{-1}}{z - a} + \frac{c_{-2}}{(z - a)^2} + \cdots + \frac{c_{-n}}{(z - a)^n}.$$

A pole of order 1 is usually called a simple pole. If there is a pole at $a$, then
$|f(z)| \to \infty$ as $z \to a$.

Finally, if $c_{-n} \neq 0$ for infinitely many values of $n$, the point $a$ is called an
essential singularity. In this case, $f(z)$ does not tend to a limit as $z \to a$.


Example 1. Removable singularity. Let $f(z) = (\sin z)/z$ if $z \neq 0$, $f(0) = 0$. This function
is analytic everywhere except at 0. (It is discontinuous at 0, since $(\sin z)/z \to 1$ as
$z \to 0$.) The Laurent expansion about 0 has the form

$$\frac{\sin z}{z} = 1 - \frac{z^2}{3!} + \frac{z^4}{5!} - \cdots.$$

Since no negative powers of $z$ appear, the point 0 is a removable singularity. If we
redefine $f$ to have the value 1 at 0, the modified function becomes analytic at 0.

Example 2. Pole. Let $f(z) = (\sin z)/z^5$ if $z \neq 0$. The Laurent expansion about 0 is

$$\frac{\sin z}{z^5} = z^{-4} - \frac{1}{3!} z^{-2} + \frac{1}{5!} - \frac{1}{7!} z^2 + \cdots.$$

In this case, the point 0 is a pole of order 4. Note that nothing has been said about the
value of $f$ at 0.

Example 3. Essential singularity. Let $f(z) = e^{1/z}$ if $z \neq 0$. The point 0 is an essential
singularity, since

$$e^{1/z} = 1 + z^{-1} + \frac{1}{2!} z^{-2} + \cdots + \frac{1}{n!} z^{-n} + \cdots.$$

Theorem 16.33. Assume that $f$ is analytic on an open region $S$ in $\mathbf{C}$ and define $g$ by
the equation $g(z) = 1/f(z)$ if $f(z) \neq 0$. Then $f$ has a zero of order $k$ at a point $a$ in
$S$ if, and only if, $g$ has a pole of order $k$ at $a$.


Proof. If $f$ has a zero of order $k$ at $a$, there is a deleted neighborhood $B'(a)$ in
which $f$ does not vanish. In the neighborhood $B(a)$ we have $f(z) = (z - a)^k h(z)$,
where $h(z) \neq 0$ if $z \in B(a)$. Hence, $1/h$ is analytic in $B(a)$ and has an expansion

$$\frac{1}{h(z)} = b_0 + b_1 (z - a) + \cdots, \qquad \text{where } b_0 = \frac{1}{h(a)} \neq 0.$$

Therefore, if $z \in B'(a)$, we have

$$g(z) = \frac{1}{(z - a)^k h(z)} = \frac{b_0}{(z - a)^k} + \frac{b_1}{(z - a)^{k-1}} + \cdots,$$

and hence $a$ is a pole of order $k$ for $g$. The converse is similarly proved.


16.21 THE RESIDUE OF A FUNCTION AT AN ISOLATED SINGULAR POINT 

If a is an isolated singular point of /, there is a deleted neighborhood B'(a) on 
which /has a Laurent expansion, say 

00 oo 

/(z) = S c„(z -<*)"+£ c_„(z - a)~ n . (30) 

n-0 n = 1 

The coefficient c_, which multiplies (z - a) -1 is called the residue of/ at a and 
is denoted by the symbol 


c_! = Res /(z). 

z — a 

Formula (23) tells us that 


/(z) dz = 2ni Res /(z), (31) 

Jy I= « 

if y is any positively oriented circular path with center at a whose graph lies in the 
disk B(a). 

In many cases it is relatively easy to evaluate the residue at a point without the use of integration. For example, if $a$ is a simple pole, we can use formula (30) to obtain

$$\operatorname*{Res}_{z=a} f(z) = \lim_{z \to a} (z - a) f(z). \tag{32}$$

Similarly, if $a$ is a pole of order 2, it is easy to show that

$$\operatorname*{Res}_{z=a} f(z) = g'(a), \quad \text{where } g(z) = (z - a)^2 f(z).$$

In cases like this, where the residue can be computed very easily, (31) gives us a simple method for evaluating contour integrals around circuits.
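The relation between (31) and the shortcut formulas can be checked numerically. The following is a minimal sketch, not part of the text's development: it approximates the circle integral in (31) by the trapezoidal rule and divides by $2\pi i$. The helper names `circle_integral` and `residue`, the radius, and the three test functions are assumptions chosen for illustration.

```python
import cmath
import math


def circle_integral(f, center, radius, n=512):
    """Approximate the integral of f over the positively oriented circle
    |z - center| = radius using the n-point trapezoidal rule."""
    total = 0j
    for j in range(n):
        w = cmath.exp(2j * math.pi * j / n)
        z = center + radius * w
        total += f(z) * (1j * radius * w)  # f(gamma(theta)) * gamma'(theta)
    return total * (2 * math.pi / n)


def residue(f, a, radius=0.5):
    """Residue at an isolated singular point a, read off from formula (31)."""
    return circle_integral(f, a, radius) / (2j * math.pi)


# Simple pole of 1/(z^2 + 1) at z = i: formula (32) gives 1/(2i) = -i/2.
simple = residue(lambda z: 1 / (z * z + 1), 1j)

# Pole of order 2 of e^z/(z - 1)^2 at z = 1: here g(z) = e^z, so the residue is g'(1) = e.
double = residue(lambda z: cmath.exp(z) / (z - 1) ** 2, 1.0)

# Essential singularity of e^{1/z} at 0 (Example 3): the coefficient of 1/z is 1.
essential = residue(lambda z: cmath.exp(1 / z), 0.0)

print(simple, double, essential)
```

The circle radius only needs to be small enough that no other singularity lies on or inside the circle.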

Cauchy was the first to exploit this idea and he developed it into a powerful 
method known as the residue calculus. It is based on the Cauchy residue theorem 
which is a generalization of (31). 


16.22 THE CAUCHY RESIDUE THEOREM 


Theorem 16.34. Let $f$ be analytic on an open region $S$ except for a finite number of isolated singularities $z_1, \ldots, z_n$ in $S$. Let $\gamma$ be a circuit which is homotopic to a point in $S$, and assume that none of the singularities lies on the graph of $\gamma$. Then we have

$$\int_\gamma f(z)\,dz = 2\pi i \sum_{k=1}^{n} n(\gamma, z_k) \operatorname*{Res}_{z=z_k} f(z), \tag{33}$$

where $n(\gamma, z_k)$ is the winding number of $\gamma$ with respect to $z_k$.


Proof. The proof is based on the following formula, where $m$ denotes an integer (positive, negative, or zero):

$$\int_\gamma (z - z_k)^m\,dz = \begin{cases} 2\pi i\, n(\gamma, z_k) & \text{if } m = -1, \\ 0 & \text{if } m \neq -1. \end{cases} \tag{34}$$

The formula for $m = -1$ is just the definition of the winding number $n(\gamma, z_k)$. Let $[a, b]$ denote the domain of $\gamma$. If $m \neq -1$, let $g(t) = \{\gamma(t) - z_k\}^{m+1}$ for $t$ in $[a, b]$. Then we have

$$\int_\gamma (z - z_k)^m\,dz = \int_a^b \{\gamma(t) - z_k\}^m \gamma'(t)\,dt = \frac{1}{m+1} \int_a^b g'(t)\,dt = \frac{1}{m+1}\{g(b) - g(a)\} = 0,$$

since $g(b) = g(a)$. This proves (34).

To prove the residue theorem, let $f_k$ denote the principal part of $f$ at the point $z_k$. By Theorem 16.31, $f_k$ is analytic everywhere in $\mathbf{C}$ except at $z_k$. Therefore $f - f_1$ is analytic in $S$ except at $z_2, \ldots, z_n$. Similarly, $f - f_1 - f_2$ is analytic in $S$ except at $z_3, \ldots, z_n$ and, by induction, we find that $f - \sum_{k=1}^{n} f_k$ is analytic everywhere in $S$. Therefore, by Cauchy’s integral theorem, $\int_\gamma (f - \sum_{k=1}^{n} f_k) = 0$, or

$$\int_\gamma f = \sum_{k=1}^{n} \int_\gamma f_k.$$

Now we express $f_k$ as a Laurent series about $z_k$ and integrate this series term by term, using (34) and the definition of residue to obtain (33).

note. If $\gamma$ is a positively oriented Jordan curve with graph $\Gamma$, then $n(\gamma, z_k) = 1$ for each $z_k$ inside $\Gamma$, and $n(\gamma, z_k) = 0$ for each $z_k$ outside $\Gamma$. In this case, the integral of $f$ along $\gamma$ is $2\pi i$ times the sum of the residues at those singularities lying inside $\Gamma$.
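Formula (33), including the role of the winding number, can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function $1/(z(z-1)(z-3))$, the circle of radius 2 traversed $k$ times, and the helper names are all choices made here. The residues $1/3$, $-1/2$, $1/6$ at $0$, $1$, $3$ come from formula (32).

```python
import cmath
import math


def circuit_integral(f, gamma, dgamma, n=2048):
    """Integral of f along a smooth closed path gamma: [0, 1] -> C,
    approximated by the trapezoidal rule applied to f(gamma(t)) * gamma'(t)."""
    h = 1.0 / n
    return sum(f(gamma(j * h)) * dgamma(j * h) for j in range(n)) * h


def winding_number(gamma, dgamma, z0):
    """n(gamma, z0) = (1/(2 pi i)) * integral of dz/(z - z0), rounded to an integer."""
    value = circuit_integral(lambda z: 1 / (z - z0), gamma, dgamma) / (2j * math.pi)
    return round(value.real)


def f(z):
    return 1 / (z * (z - 1) * (z - 3))


# Residues of f at its simple poles, computed from formula (32).
residues = {0: 1 / 3, 1: -1 / 2, 3: 1 / 6}


def both_sides(k):
    """Left and right members of (33) for the circle |z| = 2 traversed k times."""
    gamma = lambda t: 2 * cmath.exp(2j * math.pi * k * t)
    dgamma = lambda t: 4j * math.pi * k * cmath.exp(2j * math.pi * k * t)
    lhs = circuit_integral(f, gamma, dgamma)
    rhs = 2j * math.pi * sum(
        winding_number(gamma, dgamma, zk) * res for zk, res in residues.items()
    )
    return lhs, rhs


for k in (1, 2, -1):
    print(k, both_sides(k))
```

The pole at 3 lies outside the circle, so its winding number is 0 and it contributes nothing, while traversing the circle twice doubles the integral.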

Some of the applications of the Cauchy residue theorem are given in the next 
few sections. 


16.23 COUNTING ZEROS AND POLES IN A REGION 


If $f$ is analytic or has a pole at $a$, and if $f$ is not identically 0, the Laurent expansion about $a$ has the form

$$f(z) = \sum_{n=m}^{\infty} c_n (z - a)^n,$$

where $c_m \neq 0$. If $m > 0$ there is a zero at $a$ of order $m$; if $m < 0$ there is a pole at $a$ of order $-m$, and if $m = 0$ there is neither a zero nor a pole at $a$.

note. We also write $m(f; a)$ for $m$ to emphasize that $m$ depends on both $f$ and $a$.


Theorem 16.35. Let $f$ be a function, not identically zero, which is analytic on an open region $S$, except possibly for a finite number of poles. Let $\gamma$ be a circuit which is homotopic to a point in $S$ and whose graph contains no zero or pole of $f$. Then we have

$$\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = \sum_{a \in S} n(\gamma, a)\, m(f; a), \tag{35}$$

where the sum on the right contains only a finite number of nonzero terms.


note. If $\gamma$ is a positively oriented Jordan curve with graph $\Gamma$, then $n(\gamma, a) = 1$ for each $a$ inside $\Gamma$ and (35) is usually written in the form

$$\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = N - P, \tag{36}$$

where $N$ denotes the number of zeros and $P$ the number of poles of $f$ inside $\Gamma$, each counted as often as its order indicates.


Proof. Suppose that in a deleted neighborhood of a point $a$ we have $f(z) = (z - a)^m g(z)$, where $g$ is analytic at $a$ and $g(a) \neq 0$, $m$ being an integer (positive or negative). Then there is a deleted neighborhood of $a$ on which we can write

$$\frac{f'(z)}{f(z)} = \frac{m}{z - a} + \frac{g'(z)}{g(z)},$$

the quotient $g'/g$ being analytic at $a$. This equation tells us that a zero of $f$ of order $m$ is a simple pole of $f'/f$ with residue $m$. Similarly, a pole of $f$ of order $m$ is a simple pole of $f'/f$ with residue $-m$. This fact, used in conjunction with Cauchy’s residue theorem, yields (35).
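The count $N - P$ in (36) can be reproduced numerically. The sketch below assumes a hypothetical test function $f(z) = (z^2 + 1)(z - 3)/(z - \tfrac12)$, whose logarithmic derivative splits into the simple fractions described in the proof, and integrates $f'/f$ around circles centered at 0. The helper names are illustrative only.

```python
import cmath
import math


def circle_integral(g, center, radius, n=2048):
    """Trapezoidal approximation to the integral of g over the
    positively oriented circle |z - center| = radius."""
    total = 0j
    for j in range(n):
        w = cmath.exp(2j * math.pi * j / n)
        total += g(center + radius * w) * (1j * radius * w)
    return total * (2 * math.pi / n)


def log_derivative(z):
    """f'(z)/f(z) for f(z) = (z^2 + 1)(z - 3)/(z - 1/2): each zero of order m
    contributes m/(z - a) and each pole of order m contributes -m/(z - a)."""
    return 2 * z / (z * z + 1) + 1 / (z - 3) - 1 / (z - 0.5)


def zeros_minus_poles(radius):
    """N - P inside the circle |z| = radius, computed from formula (36)."""
    return circle_integral(log_derivative, 0, radius) / (2j * math.pi)


# Inside |z| = 2: zeros at i and -i, pole at 1/2, so N - P = 2 - 1.
inside_2 = zeros_minus_poles(2)
# Inside |z| = 4: the zero at 3 is now included, so N - P = 3 - 1.
inside_4 = zeros_minus_poles(4)
print(inside_2, inside_4)
```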
16.24 EVALUATION OF REAL-VALUED INTEGRALS BY MEANS OF RESIDUES

Cauchy’s residue theorem can sometimes be used to evaluate real-valued Riemann integrals. There are several techniques available, depending on the form of the integral. We shall describe briefly two of these methods.

The first method deals with integrals of the form $\int_0^{2\pi} R(\sin\theta, \cos\theta)\,d\theta$, where $R$ is a rational function* of two variables.


Theorem 16.36. Let $R$ be a rational function of two variables and let

$$f(z) = R\left(\frac{z^2 - 1}{2iz}, \frac{z^2 + 1}{2z}\right),$$

whenever the expression on the right is finite. Let $\gamma$ denote the positively oriented unit circle with center at 0. Then

$$\int_0^{2\pi} R(\sin\theta, \cos\theta)\,d\theta = \int_\gamma \frac{f(z)}{iz}\,dz, \tag{37}$$

provided that $f$ has no poles on the graph of $\gamma$.


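Proof. Since $\gamma(\theta) = e^{i\theta}$ with $0 \le \theta \le 2\pi$, we have

$$\gamma'(\theta) = i\gamma(\theta), \qquad \frac{\gamma(\theta)^2 - 1}{2i\gamma(\theta)} = \sin\theta, \qquad \frac{\gamma(\theta)^2 + 1}{2\gamma(\theta)} = \cos\theta,$$

and (37) follows at once from Theorem 16.7.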


note. To evaluate the integral on the right of (37), we need only compute the 
residues of the integrand at those poles which lie inside the unit circle. 


* A function $P$ defined on $\mathbf{C} \times \mathbf{C}$ by an equation of the form

$$P(z_1, z_2) = \sum_{m=0}^{p} \sum_{n=0}^{q} a_{m,n} z_1^m z_2^n$$

is called a polynomial in two variables. The coefficients $a_{m,n}$ may be real or complex. The quotient of two such polynomials is called a rational function of two variables.

Example. Evaluate $I = \int_0^{2\pi} d\theta/(a + \cos\theta)$, where $a$ is real, $|a| > 1$. Applying (37), we find

$$I = -2i \int_\gamma \frac{dz}{z^2 + 2az + 1}.$$

The integrand has simple poles at the roots of the equation $z^2 + 2az + 1 = 0$. These are the points

$$z_1 = -a + \sqrt{a^2 - 1}, \qquad z_2 = -a - \sqrt{a^2 - 1}.$$

The corresponding residues $R_1$ and $R_2$ are given by

$$R_1 = \lim_{z \to z_1} \frac{z - z_1}{z^2 + 2az + 1} = \frac{1}{z_1 - z_2}, \qquad R_2 = \lim_{z \to z_2} \frac{z - z_2}{z^2 + 2az + 1} = \frac{1}{z_2 - z_1}.$$

If $a > 1$, $z_1$ is inside the unit circle, $z_2$ is outside, and $I = 4\pi/(z_1 - z_2) = 2\pi/\sqrt{a^2 - 1}$. If $a < -1$, $z_2$ is inside, $z_1$ is outside, and we get $I = -2\pi/\sqrt{a^2 - 1}$.
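The closed form obtained in this example can be compared with a direct numerical quadrature. The following sketch is a check under stated assumptions: the function names `direct` and `by_residues` and the sample values $a = 2$ and $a = -3$ are chosen here for illustration.

```python
import math


def direct(a, n=2000):
    """Trapezoidal rule for the integral of 1/(a + cos(theta)) over [0, 2*pi].
    The integrand is smooth and periodic, so the rule converges very quickly."""
    h = 2 * math.pi / n
    return sum(1.0 / (a + math.cos(j * h)) for j in range(n)) * h


def by_residues(a):
    """I = -2i * 2*pi*i * (residue at the pole inside the unit circle)
         = 4*pi * (that residue)."""
    root = math.sqrt(a * a - 1)
    z1, z2 = -a + root, -a - root
    inside_residue = 1 / (z1 - z2) if abs(z1) < 1 else 1 / (z2 - z1)
    return 4 * math.pi * inside_residue


for a in (2.0, -3.0):
    print(a, direct(a), by_residues(a))
```

For $a > 1$ both values agree with $2\pi/\sqrt{a^2 - 1}$, and for $a < -1$ with $-2\pi/\sqrt{a^2 - 1}$.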


Many improper integrals can be dealt with by means of the following theorem : 


Theorem 16.37. Let $T = \{x + iy : y \ge 0\}$ denote the upper half-plane. Let $S$ be an open region in $\mathbf{C}$ which contains $T$ and suppose $f$ is analytic on $S$, except, possibly, for a finite number of poles. Suppose further that none of these poles is on the real axis. If

$$\lim_{R \to +\infty} \int_0^{\pi} f(Re^{i\theta})\, Re^{i\theta}\,d\theta = 0, \tag{38}$$

then

$$\lim_{R \to +\infty} \int_{-R}^{R} f(x)\,dx = 2\pi i \sum_{k=1}^{n} \operatorname*{Res}_{z=z_k} f(z), \tag{39}$$

where $z_1, \ldots, z_n$ are the poles of $f$ which lie in $T$.


Proof. Let $\gamma$ be the positively oriented path formed by taking a portion of the real axis from $-R$ to $R$ and a semicircle in $T$ having $[-R, R]$ as its diameter, where $R$ is taken large enough to enclose all the poles $z_1, \ldots, z_n$. Then

$$2\pi i \sum_{k=1}^{n} \operatorname*{Res}_{z=z_k} f(z) = \int_\gamma f(z)\,dz = \int_{-R}^{R} f(x)\,dx + i \int_0^{\pi} f(Re^{i\theta})\, Re^{i\theta}\,d\theta.$$

When $R \to +\infty$, the last integral tends to zero by (38) and we obtain (39).


note. Equation (38) is automatically satisfied if $f$ is the quotient of two polynomials, say $f = P/Q$, provided that the degree of $Q$ exceeds the degree of $P$ by at least 2. (See Exercise 16.36.)


Example. To evaluate $\int_{-\infty}^{\infty} dx/(1 + x^4)$, let $f(z) = 1/(z^4 + 1)$. Then $P(z) = 1$, $Q(z) = 1 + z^4$, and hence (38) holds. The poles of $f$ are the roots of the equation $1 + z^4 = 0$. These are $z_1, z_2, z_3, z_4$, where

$$z_k = e^{(2k-1)\pi i/4} \qquad (k = 1, 2, 3, 4).$$

Of these, only $z_1$ and $z_2$ lie in the upper half-plane. The residue at $z_1$ is

$$\operatorname*{Res}_{z=z_1} f(z) = \lim_{z \to z_1} (z - z_1) f(z) = \frac{1}{(z_1 - z_2)(z_1 - z_3)(z_1 - z_4)} = \frac{1}{4i} e^{-\pi i/4}.$$

Similarly, we find $\operatorname*{Res}_{z=z_2} f(z) = (1/4i) e^{\pi i/4}$. Therefore,

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^4} = \frac{2\pi i}{4i}\left(e^{-\pi i/4} + e^{\pi i/4}\right) = \pi \cos\frac{\pi}{4} = \frac{\pi}{2}\sqrt{2}.$$
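The value $\tfrac{\pi}{2}\sqrt{2}$ can be cross-checked in two ways. The sketch below is an illustration with assumed names and parameters: it sums the residues in the upper half-plane using the shortcut $1/Q'(z_k)$ for a simple zero of $Q$ (the case $f = 1$ of Exercise 16.25(c)), and it compares the result with a midpoint-rule integral over the truncated interval $[-50, 50]$, whose neglected tails are smaller than $10^{-5}$.

```python
import cmath
import math

# Roots of 1 + z^4 = 0: z_k = exp((2k - 1) * pi * i / 4), k = 1, 2, 3, 4.
poles = [cmath.exp(1j * math.pi * (2 * k - 1) / 4) for k in (1, 2, 3, 4)]
upper = [z for z in poles if z.imag > 0]

# At a simple zero z_k of Q(z) = 1 + z^4 the residue of 1/Q is 1/Q'(z_k) = 1/(4 z_k^3).
residue_sum = sum(1 / (4 * z ** 3) for z in upper)
by_residues = 2j * math.pi * residue_sum


def midpoint(f, lo, hi, n):
    """Composite midpoint rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    return sum(f(lo + (j + 0.5) * h) for j in range(n)) * h


truncated = midpoint(lambda x: 1 / (1 + x ** 4), -50.0, 50.0, 200000)

print(by_residues, truncated, math.pi / math.sqrt(2))
```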


16.25 EVALUATION OF GAUSS’S SUM BY RESIDUE CALCULUS 

The residue theorem is often used to evaluate sums by integration. We illustrate with a famous example called Gauss’s sum $G(n)$, defined by the formula

$$G(n) = \sum_{r=0}^{n-1} e^{2\pi i r^2/n}, \tag{40}$$

where $n \ge 1$. This sum occurs in various parts of the Theory of Numbers. For small values of $n$ it can easily be computed from its definition. For example, we have

$$G(1) = 1, \quad G(2) = 0, \quad G(3) = i\sqrt{3}, \quad G(4) = 2(1 + i).$$

Although each term of the sum has absolute value 1, the sum itself has absolute value 0, $\sqrt{n}$, or $\sqrt{2n}$. In fact, Gauss proved the remarkable formula

$$G(n) = \tfrac{1}{2}\sqrt{n}\,(1 + i)(1 + e^{-\pi i n/2}), \tag{41}$$

for every $n \ge 1$. A number of different proofs of (41) are known. We will deduce (41) by considering a more general sum $S(a, n)$ introduced by Dirichlet,

$$S(a, n) = \sum_{r=0}^{n-1} e^{\pi i a r^2/n},$$

where $n$ and $a$ are positive integers. If $a = 2$, then $S(2, n) = G(n)$. Dirichlet proved (41) as a corollary of a reciprocity law for $S(a, n)$ which can be stated as follows:


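Theorem 16.38. If the product $na$ is even, we have

$$S(a, n) = \sqrt{\frac{n}{a}} \left(\frac{1 + i}{\sqrt{2}}\right) \overline{S(n, a)}, \tag{42}$$

where the bar denotes the complex conjugate.

note. To deduce Gauss’s formula (41), we take $a = 2$ in (42), and observe that $\overline{S(n, 2)} = 1 + e^{-\pi i n/2}$.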
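Both Gauss's formula (41) and the reciprocity law (42) lend themselves to a quick numerical sanity check before the proof is read. The sketch below computes the sums directly from their definitions. The function names and the particular pairs $(a, n)$ are choices made here for illustration.

```python
import cmath
import math


def S(a, n):
    """Dirichlet's sum S(a, n) = sum over r = 0..n-1 of exp(pi*i*a*r^2/n)."""
    return sum(cmath.exp(1j * math.pi * a * r * r / n) for r in range(n))


def G(n):
    """Gauss's sum (40), which is S(2, n)."""
    return S(2, n)


def gauss_formula(n):
    """Right member of (41)."""
    return 0.5 * math.sqrt(n) * (1 + 1j) * (1 + cmath.exp(-1j * math.pi * n / 2))


def reciprocity_rhs(a, n):
    """Right member of (42); the law requires the product n*a to be even."""
    return math.sqrt(n / a) * (1 + 1j) / math.sqrt(2) * S(n, a).conjugate()


for n in range(1, 9):
    print(n, G(n), gauss_formula(n))

# Pairs (a, n) with n*a even.
pairs = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 7), (6, 5), (3, 2)]
for a, n in pairs:
    print((a, n), S(a, n), reciprocity_rhs(a, n))
```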


Proof. The proof given here is particularly instructive because it illustrates several techniques used in complex analysis. Some minor computational details are left as exercises for the reader.

Let $g$ be the function defined by the equation

$$g(z) = \sum_{r=0}^{n-1} e^{\pi i a (z + r)^2/n}. \tag{43}$$

Then $g$ is analytic everywhere, and $g(0) = S(a, n)$. Since $na$ is even we find

$$g(z + 1) - g(z) = e^{\pi i a z^2/n}\left(e^{2\pi i a z} - 1\right) = e^{\pi i a z^2/n}\left(e^{2\pi i z} - 1\right) \sum_{m=0}^{a-1} e^{2\pi i m z}$$

(Exercise 16.41). Now define $f$ by the equation

$$f(z) = g(z)/(e^{2\pi i z} - 1).$$

Then $f$ is analytic everywhere except for a first-order pole at each integer, and $f$ satisfies the equation

$$f(z + 1) = f(z) + \varphi(z), \tag{44}$$

where

$$\varphi(z) = e^{\pi i a z^2/n} \sum_{m=0}^{a-1} e^{2\pi i m z}. \tag{45}$$

The function $\varphi$ is analytic everywhere.

At $z = 0$ the residue of $f$ is $g(0)/(2\pi i)$ (Exercise 16.41), and hence

$$S(a, n) = g(0) = 2\pi i \operatorname*{Res}_{z=0} f(z) = \int_\gamma f(z)\,dz, \tag{46}$$

where $\gamma$ is any positively oriented simple closed path whose graph contains only the pole $z = 0$ in its interior region. We will choose $\gamma$ so that it describes a parallelogram with vertices $A$, $A + 1$, $B + 1$, $B$, where

$$A = -\tfrac{1}{2} - Re^{\pi i/4} \quad \text{and} \quad B = -\tfrac{1}{2} + Re^{\pi i/4},$$

Figure 16.7

as shown in Fig. 16.7. Integrating $f$ along $\gamma$ we have

$$\int_\gamma f = \int_A^{A+1} f + \int_{A+1}^{B+1} f + \int_{B+1}^{B} f + \int_B^{A} f.$$

In the integral $\int_{A+1}^{B+1} f$ we make the change of variable $w = z + 1$ and then use (44) to get

$$\int_{A+1}^{B+1} f(w)\,dw = \int_A^B f(z + 1)\,dz = \int_A^B f(z)\,dz + \int_A^B \varphi(z)\,dz.$$

Therefore (46) becomes

$$S(a, n) = \int_A^B \varphi(z)\,dz + \int_A^{A+1} f(z)\,dz - \int_B^{B+1} f(z)\,dz. \tag{47}$$

Now we show that the integrals along the horizontal segments from $A$ to $A + 1$ and from $B$ to $B + 1$ tend to 0 as $R \to +\infty$. To do this we estimate the integrand on these segments. We write

$$|f(z)| = \frac{|g(z)|}{|e^{2\pi i z} - 1|}, \tag{48}$$

and estimate the numerator and denominator separately.

On the segment joining $B$ to $B + 1$ we let

$$\gamma(t) = t + Re^{\pi i/4}, \quad \text{where } -\tfrac{1}{2} \le t \le \tfrac{1}{2}.$$

From (43) we find

$$|g[\gamma(t)]| \le \sum_{r=0}^{n-1} \left| \exp\left\{ \frac{\pi i a (t + Re^{\pi i/4} + r)^2}{n} \right\} \right|, \tag{49}$$

where $\exp z = e^z$. The expression in braces has real part (Exercise 16.41)

$$-\pi a(\sqrt{2}\,tR + R^2 + \sqrt{2}\,rR)/n.$$

Since $|e^{x+iy}| = e^x$ and $\exp\{-\pi a \sqrt{2}\,rR/n\} \le 1$, each term in (49) has absolute value not exceeding $\exp\{-\pi a R^2/n\} \exp\{-\sqrt{2}\,\pi a t R/n\}$. But $-\tfrac{1}{2} \le t \le \tfrac{1}{2}$, so we obtain the estimate

$$|g[\gamma(t)]| \le n\, e^{\pi\sqrt{2}\,aR/(2n)}\, e^{-\pi a R^2/n}.$$

For the denominator in (48) we use the triangle inequality in the form

$$|e^{2\pi i z} - 1| \ge \bigl|\,|e^{2\pi i z}| - 1\,\bigr|.$$

Since $|\exp\{2\pi i \gamma(t)\}| = \exp\{-2\pi R \sin(\pi/4)\} = \exp\{-\sqrt{2}\,\pi R\}$, we find

$$|e^{2\pi i \gamma(t)} - 1| \ge 1 - e^{-\sqrt{2}\,\pi R}.$$

Therefore on the line segment joining $B$ to $B + 1$ we have the estimate

$$|f(z)| \le \frac{n\, e^{\pi\sqrt{2}\,aR/(2n)}\, e^{-\pi a R^2/n}}{1 - e^{-\sqrt{2}\,\pi R}} = o(1) \quad \text{as } R \to +\infty.$$

Here $o(1)$ denotes a function of $R$ which tends to 0 as $R \to +\infty$.

A similar argument shows that the integrand tends to 0 on the segment joining $A$ to $A + 1$ as $R \to +\infty$. Since the length of the path of integration is 1 in each case, this shows that the second and third integrals on the right of (47) tend to 0 as $R \to +\infty$. Therefore we can write (47) in the form

$$S(a, n) = \int_A^B \varphi(z)\,dz + o(1) \quad \text{as } R \to +\infty. \tag{50}$$

Figure 16.8


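To deal with the integral $\int_A^B \varphi$ we apply Cauchy’s theorem, integrating $\varphi$ around the parallelogram with vertices $A$, $B$, $\alpha$, $-\alpha$, where $\alpha = B + \tfrac{1}{2} = Re^{\pi i/4}$. (See Fig. 16.8.) Since $\varphi$ is analytic everywhere, its integral around this parallelogram is 0, so

$$\int_A^B \varphi + \int_B^{\alpha} \varphi + \int_{\alpha}^{-\alpha} \varphi + \int_{-\alpha}^{A} \varphi = 0. \tag{51}$$

Because of the exponential factor $e^{\pi i a z^2/n}$ in (45), an argument similar to that given above shows that the integral of $\varphi$ along each horizontal segment $\to 0$ as $R \to +\infty$. Therefore (51) gives us

$$\int_A^B \varphi = \int_{-\alpha}^{\alpha} \varphi + o(1) \quad \text{as } R \to +\infty,$$

and (50) becomes

$$S(a, n) = \int_{-\alpha}^{\alpha} \varphi(z)\,dz + o(1) \quad \text{as } R \to +\infty, \tag{52}$$

where $\alpha = Re^{\pi i/4}$. Using (45) we find

$$\int_{-\alpha}^{\alpha} \varphi(z)\,dz = \sum_{m=0}^{a-1} \int_{-\alpha}^{\alpha} e^{\pi i a z^2/n} e^{2\pi i m z}\,dz = \sum_{m=0}^{a-1} e^{-\pi i n m^2/a} I(a, m, n, R),$$

where

$$I(a, m, n, R) = \int_{-\alpha}^{\alpha} \exp\left\{ \frac{\pi i a}{n} \left(z + \frac{nm}{a}\right)^2 \right\} dz.$$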

Applying Cauchy’s theorem again to the parallelogram with vertices $-\alpha$, $\alpha$, $\alpha - nm/a$, $-\alpha - nm/a$, we find as before that the integrals along the horizontal segments $\to 0$ as $R \to +\infty$, so

$$I(a, m, n, R) = \int_{-\alpha - nm/a}^{\alpha - nm/a} \exp\left\{ \frac{\pi i a}{n} \left(z + \frac{nm}{a}\right)^2 \right\} dz + o(1) \quad \text{as } R \to +\infty.$$

The change of variable $w = \sqrt{a/n}\,(z + nm/a)$ puts this into the form

$$I(a, m, n, R) = \sqrt{\frac{n}{a}} \int_{-\alpha\sqrt{a/n}}^{\alpha\sqrt{a/n}} e^{\pi i w^2}\,dw + o(1) \quad \text{as } R \to +\infty.$$

Letting $R \to +\infty$ in (52), we find

$$S(a, n) = \sum_{m=0}^{a-1} e^{-\pi i n m^2/a} \sqrt{\frac{n}{a}} \lim_{R \to +\infty} \int_{-R\sqrt{a/n}\,e^{\pi i/4}}^{R\sqrt{a/n}\,e^{\pi i/4}} e^{\pi i w^2}\,dw. \tag{53}$$

By writing $T = \sqrt{a/n}\,R$, we see that the last limit is equal to

$$\lim_{T \to +\infty} \int_{-Te^{\pi i/4}}^{Te^{\pi i/4}} e^{\pi i w^2}\,dw = I,$$

say, where $I$ is a number independent of $a$ and $n$. Therefore (53) gives us

$$S(a, n) = \sqrt{\frac{n}{a}}\, I\, \overline{S(n, a)}. \tag{54}$$

To evaluate $I$ we take $a = 1$ and $n = 2$ in (54). Then $S(1, 2) = 1 + i$ and $S(2, 1) = 1$, so (54) implies $I = (1 + i)/\sqrt{2}$, and (54) reduces to (42).
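The value of $I$ can also be confirmed directly, as an independent check on the argument. The derivation below is a supplementary sketch, not the route taken in the text: it parametrizes the segment of integration by a real variable $s$ (a symbol introduced here) and uses the standard Gaussian integral $\int_{-\infty}^{\infty} e^{-\pi s^2}\,ds = 1$.

```latex
% Parametrize the segment from -Te^{\pi i/4} to Te^{\pi i/4} by w = s e^{\pi i/4}, -T \le s \le T.
% Then w^2 = i s^2, so e^{\pi i w^2} = e^{-\pi s^2}, and dw = e^{\pi i/4}\,ds.
\[
I = \lim_{T \to +\infty} \int_{-T}^{T} e^{-\pi s^{2}}\, e^{\pi i/4}\, ds
  = e^{\pi i/4} \int_{-\infty}^{\infty} e^{-\pi s^{2}}\, ds
  = e^{\pi i/4}
  = \frac{1 + i}{\sqrt{2}} .
\]
```

This agrees with the value obtained from the special case $a = 1$, $n = 2$.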


16.26 APPLICATION OF THE RESIDUE THEOREM TO THE INVERSION 
FORMULA FOR LAPLACE TRANSFORMS 

The following theorem is, in many cases, the easiest method for evaluating the 
limit which appears in the inversion formula for Laplace transforms. (See Exercise 
11.38.) 

Theorem 16.39. Let $F$ be a function analytic everywhere in $\mathbf{C}$ except, possibly, for a finite number of poles. Suppose there exist three positive constants $M$, $b$, $c$ such that

$$|F(z)| < \frac{M}{|z|^c} \quad \text{whenever } |z| > b.$$

Let $a$ be a positive number such that the vertical line $x = a$ contains no poles of $F$ and let $z_1, \ldots, z_n$ denote the poles of $F$ which lie to the left of this line. Then, for each real $t > 0$, we have

$$\lim_{T \to +\infty} \int_{-T}^{T} e^{(a + iv)t} F(a + iv)\,dv = 2\pi \sum_{k=1}^{n} \operatorname*{Res}_{z=z_k} \{e^{zt} F(z)\}. \tag{55}$$

Proof. We apply Cauchy’s residue theorem to the positively oriented path $\Gamma$ shown in Fig. 16.9, where the radius $T$ of the circular part is taken large enough to enclose all the poles of $F$ which lie to the left of the line $x = a$, and also $T > b$. The residue theorem gives us

$$\int_\Gamma e^{zt} F(z)\,dz = 2\pi i \sum_{k=1}^{n} \operatorname*{Res}_{z=z_k} \{e^{zt} F(z)\}. \tag{56}$$

Now write

$$\int_\Gamma = \int_A^B + \int_B^C + \int_C^D + \int_D^E + \int_E^A,$$

where $A$, $B$, $C$, $D$, $E$ are the points indicated in Fig. 16.9, and denote these integrals by $I_1, I_2, I_3, I_4, I_5$. We will prove that $I_k \to 0$ as $T \to +\infty$ when $k > 1$.

First, we have

$$|I_2| < \frac{M}{T^c} \int_{\alpha}^{\pi/2} e^{tT\cos\theta}\, T\,d\theta \le \frac{M e^{at}}{T^c}\, T\left(\frac{\pi}{2} - \alpha\right) = \frac{M e^{at}}{T^c}\, T \arcsin\frac{a}{T}.$$

Since $T \arcsin(a/T) \to a$ as $T \to +\infty$, it follows that $I_2 \to 0$ as $T \to +\infty$. In the same way we prove $I_5 \to 0$ as $T \to +\infty$.

Next, consider $I_3$. We have

$$|I_3| < \frac{M}{T^c} \int_{\pi/2}^{\pi} e^{tT\cos\theta}\, T\,d\theta = \frac{M}{T^{c-1}} \int_0^{\pi/2} e^{-tT\sin\varphi}\,d\varphi.$$

But $\sin\varphi \ge 2\varphi/\pi$ if $0 \le \varphi \le \pi/2$, and hence

$$|I_3| < \frac{M}{T^{c-1}} \int_0^{\pi/2} e^{-2tT\varphi/\pi}\,d\varphi = \frac{\pi M}{2tT^c}\left(1 - e^{-tT}\right) \to 0 \quad \text{as } T \to +\infty.$$

Similarly, we find $I_4 \to 0$ as $T \to +\infty$. But as $T \to +\infty$ the right-hand side of (56) remains unchanged. Hence $\lim_{T \to +\infty} I_1$ exists and we have

$$\lim_{T \to +\infty} I_1 = \lim_{T \to +\infty} \int_{-T}^{T} e^{(a + iv)t} F(a + iv)\, i\,dv = 2\pi i \sum_{k=1}^{n} \operatorname*{Res}_{z=z_k} \{e^{zt} F(z)\}.$$

Example. Let $F(z) = z/(z^2 + a^2)$, where $a$ is real. Then $F$ has simple poles at $\pm ia$. Since $z/(z^2 + a^2) = \tfrac{1}{2}[1/(z + ia) + 1/(z - ia)]$, we find

$$\operatorname*{Res}_{z=ia} \{e^{zt} F(z)\} = \tfrac{1}{2} e^{iat}, \qquad \operatorname*{Res}_{z=-ia} \{e^{zt} F(z)\} = \tfrac{1}{2} e^{-iat}.$$

Therefore the limit in (55) has the value $2\pi \cos at$. From Exercise 11.38 we see that the function $f$, continuous on $(0, +\infty)$, whose Laplace transform is $F$, is given by $f(t) = \cos at$.
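The residue computation in this example can be reproduced numerically. The sketch below assumes, as the example does in passing from $2\pi \cos at$ to $\cos at$, that the inverse transform is the limit in (55) divided by $2\pi$, so that it equals the sum of the residues of $e^{zt}F(z)$. The helper names, the sample value $a = 3$, and the radius of the small circles are choices made here.

```python
import cmath
import math


def circle_integral(g, center, radius, n=512):
    """Trapezoidal approximation to the integral of g over the
    positively oriented circle |z - center| = radius."""
    total = 0j
    for j in range(n):
        w = cmath.exp(2j * math.pi * j / n)
        total += g(center + radius * w) * (1j * radius * w)
    return total * (2 * math.pi / n)


def inverse_by_residues(F, poles, t, radius=0.25):
    """Sum of the residues of exp(z*t) * F(z) at the listed poles, each
    obtained from a small circle around the pole via formula (31)."""
    return sum(
        circle_integral(lambda z: cmath.exp(z * t) * F(z), p, radius) / (2j * math.pi)
        for p in poles
    )


a = 3.0


def F(z):
    return z / (z * z + a * a)


for t in (0.0, 0.7, 2.0):
    value = inverse_by_residues(F, [1j * a, -1j * a], t)
    print(t, value, math.cos(a * t))
```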

16.27 CONFORMAL MAPPINGS 

An analytic function $f$ will map two line segments, intersecting at a point $c$, into two curves intersecting at $f(c)$. In this section we show that the tangent lines to these curves intersect at the same angle as the given line segments if $f'(c) \neq 0$.

This property is geometrically obvious for linear functions. For example, suppose $f(z) = z + b$. This represents a translation which moves every line parallel to itself, and it is clear that angles are preserved. Another example is $f(z) = az$, where $a \neq 0$. If $|a| = 1$, then $a = e^{i\alpha}$ and this represents a rotation about the origin through an angle $\alpha$. If $|a| \neq 1$, then $a = Re^{i\alpha}$ and $f$ represents a rotation composed with a stretching (if $R > 1$) or a contraction (if $R < 1$). Again, angles are preserved. A general linear function $f(z) = az + b$ with $a \neq 0$ is a composition of these types and hence also preserves angles.

In the general case, differentiability at $c$ means that we have a linear approximation near $c$, say $f(z) = f(c) + f'(c)(z - c) + o(z - c)$, and if $f'(c) \neq 0$ we can expect angles to be preserved near $c$.

To formalize these ideas, let $\gamma_1$ and $\gamma_2$ be two piecewise smooth paths with respective graphs $\Gamma_1$ and $\Gamma_2$, intersecting at $c$. Suppose that $\gamma_1$ is one-to-one on an interval containing $t_1$, and that $\gamma_2$ is one-to-one on an interval containing $t_2$, where $\gamma_1(t_1) = \gamma_2(t_2) = c$. Assume also that $\gamma_1'(t_1) \neq 0$ and $\gamma_2'(t_2) \neq 0$. The difference

$$\arg[\gamma_2'(t_2)] - \arg[\gamma_1'(t_1)]$$

is called the angle from $\Gamma_1$ to $\Gamma_2$ at $c$.

Now assume that $f'(c) \neq 0$. Then (by Theorem 13.4) there is a disk $B(c)$ on which $f$ is one-to-one. Hence the composite functions

$$w_1(t) = f[\gamma_1(t)] \quad \text{and} \quad w_2(t) = f[\gamma_2(t)]$$

will be locally one-to-one near $t_1$ and $t_2$, respectively, and will describe arcs $C_1$ and $C_2$ intersecting at $f(c)$. (See Fig. 16.10.) By the chain rule we have

$$w_1'(t_1) = f'(c)\gamma_1'(t_1) \neq 0 \quad \text{and} \quad w_2'(t_2) = f'(c)\gamma_2'(t_2) \neq 0.$$

Therefore, by Theorem 1.48 there exist integers $n_1$ and $n_2$ such that

$$\arg[w_1'(t_1)] = \arg[f'(c)] + \arg[\gamma_1'(t_1)] + 2\pi n_1,$$

$$\arg[w_2'(t_2)] = \arg[f'(c)] + \arg[\gamma_2'(t_2)] + 2\pi n_2,$$

so the angle from $C_1$ to $C_2$ at $f(c)$ is equal to the angle from $\Gamma_1$ to $\Gamma_2$ at $c$ plus an integer multiple of $2\pi$. For this reason we say that $f$ preserves angles at $c$. Such a function is also said to be conformal at $c$.

Angles are not preserved at points where the derivative is zero. For example, if $f(z) = z^2$, a straight line through the origin making an angle $\alpha$ with the real axis is mapped by $f$ onto a straight line making an angle $2\alpha$ with the real axis. In general, when $f'(c) = 0$, the Taylor expansion of $f$ assumes the form

$$f(z) - f(c) = (z - c)^k [a_k + a_{k+1}(z - c) + \cdots],$$

where $k \ge 2$. Using this equation, it is easy to see that angles between curves intersecting at $c$ are multiplied by a factor $k$ under the mapping $f$.
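The contrast between a point where $f'(c) \neq 0$ and a point where $f'(c) = 0$ can be seen numerically with $f(z) = z^2$. The sketch below is an illustration under stated assumptions: it takes two rays through $c$ with directions `d1` and `d2` (names chosen here) and estimates the angle between their images from short chords $f(c + hd) - f(c)$, which approximate the tangent directions for small $h$.

```python
import cmath
import math


def image_angle(f, c, d1, d2, h=1e-6):
    """Angle from the image of the ray c + s*d1 to the image of the ray c + s*d2
    at f(c), estimated from the chords f(c + h*d) - f(c)."""
    w1 = f(c + h * d1) - f(c)
    w2 = f(c + h * d2) - f(c)
    return cmath.phase(w2 / w1)


def square(z):
    return z * z


d1 = 1 + 0j
d2 = cmath.exp(1j * math.pi / 6)
original_angle = cmath.phase(d2 / d1)  # angle between the two rays, pi/6

# At c = 1 + i the derivative 2c is nonzero, so the angle is preserved.
at_regular_point = image_angle(square, 1 + 1j, d1, d2)

# At c = 0 the derivative vanishes and k = 2, so the angle is doubled.
at_critical_point = image_angle(square, 0j, d1, d2)

print(original_angle, at_regular_point, at_critical_point)
```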

Among the important examples of conformal mappings are the Möbius transformations. These are functions $f$ defined as follows: If $a$, $b$, $c$, $d$ are four complex numbers such that $ad - bc \neq 0$, we define

$$f(z) = \frac{az + b}{cz + d}, \tag{57}$$

whenever $cz + d \neq 0$. It is convenient to define $f$ everywhere on the extended plane $\mathbf{C}^*$ by setting $f(-d/c) = \infty$ and $f(\infty) = a/c$. (If $c = 0$, these last two equations are to be replaced by the single equation $f(\infty) = \infty$.) Now (57) can be solved for $z$ in terms of $f(z)$ to get

$$z = \frac{-d f(z) + b}{c f(z) - a}.$$

This means that the inverse function $f^{-1}$ exists and is given by

$$f^{-1}(z) = \frac{-dz + b}{cz - a},$$

with the understanding that $f^{-1}(a/c) = \infty$ and $f^{-1}(\infty) = -d/c$. Thus we see that Möbius transformations are one-to-one mappings of $\mathbf{C}^*$ onto itself. They are also conformal at each finite $z \neq -d/c$, since

$$f'(z) = \frac{ad - bc}{(cz + d)^2} \neq 0.$$

One of the most important properties of these mappings is that they map circles onto circles (including straight lines as special cases of circles). The proof of this is sketched in Exercise 16.46. Further properties of Möbius transformations are also described in the exercises near the end of the chapter.
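The inverse formula and the circle-preserving property can both be tested on a concrete transformation. The sketch below is illustrative only: the coefficients $a = 2 + i$, $b = 1$, $c = i$, $d = 3$ are an arbitrary choice with $ad - bc \neq 0$ and with the pole $-d/c = 3i$ off the unit circle, and the helper names `mobius` and `circumcenter` are introduced here.

```python
import cmath
import math


def mobius(a, b, c, d):
    """Return z -> (a*z + b)/(c*z + d) together with its inverse
    w -> (-d*w + b)/(c*w - a)."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    forward = lambda z: (a * z + b) / (c * z + d)
    inverse = lambda w: (-d * w + b) / (c * w - a)
    return forward, inverse


def circumcenter(p, q, r):
    """Center of the circle through three non-collinear points of the plane."""
    z1, z2 = q - p, r - p
    num = z1 * z2 * (z1.conjugate() - z2.conjugate())
    den = z1.conjugate() * z2 - z1 * z2.conjugate()
    return p + num / den


f, f_inv = mobius(2 + 1j, 1, 1j, 3)

# Twelve points on the unit circle and their images under f.
circle = [cmath.exp(2j * math.pi * k / 12) for k in range(12)]
images = [f(z) for z in circle]

# Fit a circle through three of the images and measure how far the others are from it.
center = circumcenter(images[0], images[4], images[8])
radius = abs(images[0] - center)
spread = max(abs(abs(w - center) - radius) for w in images)

# Check that f_inv undoes f.
round_trip = max(abs(f_inv(f(z)) - z) for z in circle)

print(center, radius, spread, round_trip)
```

A `spread` at the level of rounding error indicates that all twelve image points lie on one circle.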


EXERCISES 


Complex integration; Cauchy’s integral formulas 


16.1 Let $\gamma$ be a piecewise smooth path with domain $[a, b]$ and graph $\Gamma$. Assume that the integral $\int_\gamma f$ exists. Let $S$ be an open region containing $\Gamma$ and let $g$ be a function such that $g'(z)$ exists and equals $f(z)$ for each $z$ on $\Gamma$. Prove that

$$\int_\gamma f = \int_\gamma g' = g(B) - g(A), \quad \text{where } A = \gamma(a) \text{ and } B = \gamma(b).$$

In particular, if $\gamma$ is a circuit, then $A = B$ and the integral is 0. Hint. Apply Theorem 7.34 to each interval of continuity of $\gamma'$.


16.2 Let $\gamma$ be a positively oriented circular path with center 0 and radius 2. Verify each of the following by using one of Cauchy’s integral formulas.

e) $\displaystyle\int_\gamma \frac{e^z}{z(z - 1)}\,dz = 2\pi i(e - 1)$.    f) $\displaystyle\int_\gamma \frac{e^z}{z^2(z - 1)}\,dz = 2\pi i(e - 2)$.

16.3 Let $f = u + iv$ be analytic on a disk $B(a; R)$. If $0 < r < R$, prove that

$$f'(a) = \frac{1}{\pi r} \int_0^{2\pi} u(a + re^{i\theta})\, e^{-i\theta}\,d\theta.$$

16.4 a) Prove the following stronger version of Liouville’s theorem: If $f$ is an entire function such that $\lim_{z \to \infty} |f(z)/z| = 0$, then $f$ is a constant.

b) What can you conclude about an entire function which satisfies an inequality of the form $|f(z)| \le M|z|^c$ for every complex $z$, where $c > 0$?

16.5 Assume that $f$ is analytic on $B(0; R)$. Let $\gamma$ denote the positively oriented circle with center at 0 and radius $r$, where $0 < r < R$. If $a$ is inside $\gamma$, show that

$$f(a) = \frac{1}{2\pi i} \int_\gamma f(z) \left\{ \frac{1}{z - a} - \frac{1}{z - r^2/\bar{a}} \right\} dz.$$

If $a = Ae^{i\alpha}$, show that this reduces to the formula

$$f(a) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(r^2 - A^2) f(re^{i\theta})}{r^2 - 2rA\cos(\alpha - \theta) + A^2}\,d\theta.$$

By equating the real parts of this equation we obtain an expression known as Poisson’s integral formula.

16.6 Assume that $f$ is analytic on the closure of the disk $B(0; 1)$. If $|a| < 1$, show that

$$(1 - |a|^2) f(a) = \frac{1}{2\pi i} \int_\gamma f(z)\, \frac{1 - z\bar{a}}{z - a}\,dz,$$

where $\gamma$ is the positively oriented unit circle with center at 0. Deduce the inequality

$$(1 - |a|^2)\,|f(a)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(e^{i\theta})|\,d\theta.$$

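16.7 Let $f(z) = \sum_{n=0}^{\infty} 2^n z^n/3^n$ if $|z| < 3/2$, and let $g(z) = \sum_{n=0}^{\infty} (2z)^{-n}$ if $|z| > \tfrac{1}{2}$. Let $\gamma$ be the positively oriented circular path of radius 1 and center 0, and define $h(a)$ for $|a| \neq 1$ as follows:

$$h(a) = \frac{1}{2\pi i} \int_\gamma \left( \frac{f(z)}{z - a} + \frac{a^2 g(z)}{z^2 - az} \right) dz.$$

Prove that

$$h(a) = \begin{cases} \dfrac{3}{3 - 2a} & \text{if } |a| < 1, \\[2ex] \dfrac{2a^2}{1 - 2a} & \text{if } |a| > 1. \end{cases}$$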

Taylor expansions 

16.8 Define $f$ on the disk $B(0; 1)$ by the equation $f(z) = \sum_{n=0}^{\infty} z^n$. Find the Taylor expansion of $f$ about the point $a = \tfrac{1}{2}$ and also about the point $a = -\tfrac{1}{2}$. Determine the radius of convergence in each case.

16.9 Assume that $f$ has the Taylor expansion $f(z) = \sum_{n=0}^{\infty} a(n) z^n$, valid in $B(0; R)$. Let

$$g(z) = \frac{1}{p} \sum_{k=0}^{p-1} f(ze^{2\pi i k/p}).$$

Prove that the Taylor expansion of $g$ consists of every $p$th term in that of $f$. That is, if $z \in B(0; R)$ we have

$$g(z) = \sum_{n=0}^{\infty} a(pn) z^{pn}.$$

16.10 Assume that $f$ has the Taylor expansion $f(z) = \sum_{n=0}^{\infty} a_n z^n$, valid in $B(0; R)$. Let $s_n(z) = \sum_{k=0}^{n} a_k z^k$. If $0 < r < R$ and if $|z| < r$, show that

$$s_n(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w^{n+1}}\, \frac{w^{n+1} - z^{n+1}}{w - z}\,dw,$$

where $\gamma$ is the positively oriented circle with center at 0 and radius $r$.

16.11 Given the Taylor expansions $f(z) = \sum_{n=0}^{\infty} a_n z^n$ and $g(z) = \sum_{n=0}^{\infty} b_n z^n$, valid for $|z| < R_1$ and $|z| < R_2$, respectively. Prove that if $|z| < R_1 R_2$ we have

$$\frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w}\, g\!\left(\frac{z}{w}\right) dw = \sum_{n=0}^{\infty} a_n b_n z^n,$$

where $\gamma$ is the positively oriented circle of radius $R_1$ with center at 0.

16.12 Assume that $f$ has the Taylor expansion $f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n$, valid in $B(a; R)$.

a) If $0 < r < R$, deduce Parseval’s identity:

$$\frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|^2\,d\theta = \sum_{n=0}^{\infty} |a_n|^2 r^{2n}.$$

b) Use (a) to deduce the inequality $\sum_{n=0}^{\infty} |a_n|^2 r^{2n} \le M(r)^2$, where $M(r)$ is the maximum of $|f|$ on the circle $|z - a| = r$.

c) Use (b) to give another proof of the local maximum modulus principle (Theorem 16.27).

16.13 Prove Schwarz’s lemma: Let $f$ be analytic on the disk $B(0; 1)$. Suppose that $f(0) = 0$ and $|f(z)| \le 1$ if $|z| < 1$. Then

$$|f'(0)| \le 1 \quad \text{and} \quad |f(z)| \le |z|, \quad \text{if } |z| < 1.$$

If $|f'(0)| = 1$ or if $|f(z_0)| = |z_0|$ for at least one $z_0$ in $B'(0; 1)$, then

$$f(z) = e^{i\alpha} z, \quad \text{where } \alpha \text{ is real}.$$

Hint. Apply the maximum-modulus theorem to $g$, where $g(0) = f'(0)$ and $g(z) = f(z)/z$ if $z \neq 0$.


Laurent expansions, singularities, residues 


16.14 Let $f$ and $g$ be analytic on an open region $S$. Let $\gamma$ be a Jordan circuit with graph $\Gamma$ such that both $\Gamma$ and its inner region lie within $S$. Suppose that $|g(z)| < |f(z)|$ for every $z$ on $\Gamma$.

a) Show that

$$\frac{1}{2\pi i} \int_\gamma \frac{f'(z) + g'(z)}{f(z) + g(z)}\,dz = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz.$$

Hint. Let $m = \inf\{|f(z)| - |g(z)| : z \in \Gamma\}$. Then $m > 0$ and hence

$$|f(z) + tg(z)| \ge m > 0$$

for each $t$ in $[0, 1]$ and each $z$ on $\Gamma$. Now let

$$\phi(t) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z) + tg'(z)}{f(z) + tg(z)}\,dz, \quad \text{if } 0 \le t \le 1.$$

Then $\phi$ is continuous, and hence constant, on $[0, 1]$. Thus, $\phi(0) = \phi(1)$.

b) Use (a) to prove that $f$ and $f + g$ have the same number of zeros inside $\Gamma$ (Rouché’s theorem).

16.15 Let $p$ be a polynomial of degree $n$, say $p(z) = a_0 + a_1 z + \cdots + a_n z^n$, where $a_n \neq 0$. Take $f(z) = a_n z^n$, $g(z) = p(z) - f(z)$ in Rouché’s theorem, and prove that $p$ has exactly $n$ zeros in $\mathbf{C}$.

16.16 Let $f$ be analytic on the closure of the disk $B(0; 1)$ and suppose $|f(z)| < 1$ if $|z| = 1$. Show that there is one, and only one, point $z_0$ in $B(0; 1)$ such that $f(z_0) = z_0$. Hint. Use Rouché’s theorem.

16.17 Let $p_n(z)$ denote the $n$th partial sum of the Taylor expansion $e^z = \sum_{n=0}^{\infty} z^n/n!$. Using Rouché’s theorem (or otherwise), prove that for every $r > 0$ there exists an $N$ (depending on $r$) such that $n \ge N$ implies $p_n(z) \neq 0$ for every $z$ in $B(0; r)$.

16.18 If $a > e$, find the number of zeros of the function $f(z) = e^z - az^n$ which lie inside the circle $|z| = 1$.

16.19 Give an example of a function which has all the following properties, or else explain why there is no such function: $f$ is analytic everywhere in $\mathbf{C}$ except for a pole of order 2 at 0 and simple poles at $i$ and $-i$; $f(z) = f(-z)$ for all $z$; $f(1) = 1$; the function $g(z) = f(1/z)$ has a zero of order 2 at $z = 0$; and $\operatorname*{Res}_{z=i} f(z) = 2i$.


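16.20 Show that each of the following Laurent expansions is valid in the region indicated:

a) $\displaystyle\frac{1}{(z - 1)(2 - z)} = \sum_{n=0}^{\infty} \frac{z^n}{2^{n+1}} + \sum_{n=1}^{\infty} \frac{1}{z^n}$ if $1 < |z| < 2$.

b) $\displaystyle\frac{1}{(z - 1)(2 - z)} = \sum_{n=2}^{\infty} \frac{1 - 2^{n-1}}{z^n}$ if $|z| > 2$.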


16.21 For each fixed $t$ in $\mathbf{C}$, define $J_n(t)$ to be the coefficient of $z^n$ in the Laurent expansion

$$e^{(z - 1/z)t/2} = \sum_{n=-\infty}^{\infty} J_n(t) z^n.$$

Show that for $n \ge 0$ we have

$$J_n(t) = \frac{1}{\pi} \int_0^{\pi} \cos(t\sin\theta - n\theta)\,d\theta$$

and that $J_{-n}(t) = (-1)^n J_n(t)$. Deduce the power series expansion

$$J_n(t) = \sum_{k=0}^{\infty} \frac{(-1)^k (\tfrac{1}{2}t)^{n+2k}}{k!\,(n + k)!} \qquad (n \ge 0).$$

The function $J_n$ is called the Bessel function of order $n$.

16.22 Prove Riemann’s theorem: If $z_0$ is an isolated singularity of $f$ and if $|f|$ is bounded on some deleted neighborhood $B'(z_0)$, then $z_0$ is a removable singularity. Hint. Estimate the integrals for the coefficients $a_n$ in the Laurent expansion of $f$ and show that $a_n = 0$ for each $n < 0$.

16.23 Prove the Casorati-Weierstrass theorem: Assume that $z_0$ is an essential singularity of $f$ and let $c$ be an arbitrary complex number. Then, for every $\varepsilon > 0$ and every disk $B(z_0)$, there exists a point $z$ in $B(z_0)$ such that $|f(z) - c| < \varepsilon$. Hint. Assume that the theorem is false and arrive at a contradiction by applying Exercise 16.22 to $g$, where $g(z) = 1/[f(z) - c]$.

16.24 The point at infinity. A function $f$ is said to be analytic at $\infty$ if the function $g$ defined by the equation $g(z) = f(1/z)$ is analytic at the origin. Similarly, we say that $f$ has a zero, a pole, a removable singularity, or an essential singularity at $\infty$ if $g$ has a zero, a pole, etc., at 0. Liouville’s theorem states that a function which is analytic everywhere in $\mathbf{C}^*$ must be a constant. Prove that

a) $f$ is a polynomial if, and only if, the only singularity of $f$ in $\mathbf{C}^*$ is a pole at $\infty$, in which case the order of the pole is equal to the degree of the polynomial.

b) $f$ is a rational function if, and only if, $f$ has no singularities in $\mathbf{C}^*$ other than poles.

16.25 Derive the following “short cuts” for computing residues:

a) If $a$ is a first order pole for $f$, then

$$\operatorname*{Res}_{z=a} f(z) = \lim_{z \to a} (z - a) f(z).$$

b) If $a$ is a pole of order 2 for $f$, then

$$\operatorname*{Res}_{z=a} f(z) = g'(a), \quad \text{where } g(z) = (z - a)^2 f(z).$$

c) Suppose $f$ and $g$ are both analytic at $a$, with $f(a) \neq 0$ and $a$ a first-order zero for $g$. Show that

$$\operatorname*{Res}_{z=a} \frac{f(z)}{g(z)} = \frac{f(a)}{g'(a)}, \qquad \operatorname*{Res}_{z=a} \frac{f(z)}{[g(z)]^2} = \frac{f'(a)g'(a) - f(a)g''(a)}{[g'(a)]^3}.$$

d) If $f$ and $g$ are as in (c), except that $a$ is a second-order zero for $g$, then

$$\operatorname*{Res}_{z=a} \frac{f(z)}{g(z)} = \frac{6f'(a)g''(a) - 2f(a)g'''(a)}{3[g''(a)]^2}.$$

16.26 Compute the residues at the poles of $f$ if

a) $f(z) = \dfrac{ze^z}{z^2 - 1}$,

b) $f(z) = \dfrac{e^z}{z(z - 1)^2}$,

c) $f(z) = \dfrac{\sin z}{z\cos z}$,

d) $f(z) = \dfrac{1}{1 - e^z}$,

e) $f(z) = \dfrac{1}{1 - z^n}$ (where $n$ is a positive integer).


16.27 If $\gamma(a; r)$ denotes the positively oriented circle with center at $a$ and radius $r$, show that

Evaluate the integrals in Exercises 16.28 through 16.35 by means of residues.

16.28 $\displaystyle\int_0^{2\pi} \frac{dt}{(a + b\cos t)^2} = \frac{2\pi a}{(a^2 - b^2)^{3/2}}$ if $0 < b < a$.

16.29 $\displaystyle\int_0^{2\pi} \frac{\cos 2t\,dt}{1 - 2a\cos t + a^2} = \frac{2\pi a^2}{1 - a^2}$ if $a^2 < 1$.

16.30 $\displaystyle\int_0^{2\pi} \frac{(1 + \cos 3t)\,dt}{1 - 2a\cos t + a^2} = \frac{2\pi(a^2 - a + 1)}{1 - a}$ if $0 < a < 1$.

16.31 $\displaystyle\int_0^{2\pi} \frac{\sin^2 t\,dt}{a + b\cos t} = \frac{2\pi(a - \sqrt{a^2 - b^2})}{b^2}$ if $0 < b < a$.

16.32 $\displaystyle\int_{-\infty}^{\infty} \frac{1}{x^2 + x + 1}\,dx = \frac{2\pi\sqrt{3}}{3}$.

16.33 $\displaystyle\int_{-\infty}^{\infty} \frac{x^6}{(1 + x^4)^2}\,dx = \frac{3\pi\sqrt{2}}{8}$.

16.34 $\displaystyle\int_0^{\infty} \frac{x^2}{(x^2 + 4)^2 (x^2 + 9)}\,dx = \frac{\pi}{200}$.


16.35 a) $\displaystyle\int_0^{\infty} \frac{x}{1 + x^5}\,dx = \frac{\pi}{5} \Big/ \sin\frac{2\pi}{5}$.

Hint. Integrate $z/(1 + z^5)$ around the boundary of the circular sector $S = \{re^{i\theta} : 0 \le r \le R,\ 0 \le \theta \le 2\pi/5\}$, and let $R \to \infty$.

b) $\displaystyle\int_0^{\infty} \frac{x^{2m}}{1 + x^{2n}}\,dx = \frac{\pi}{2n} \Big/ \sin\left(\frac{2m + 1}{2n}\,\pi\right)$, $m$, $n$ integers, $0 < m < n$.

16.36 Prove that formula (38) holds if $f$ is the quotient of two polynomials, say $f = P/Q$, where the degree of $Q$ exceeds that of $P$ by 2 or more.

16.37 Prove that formula (38) holds if $f(z) = e^{imz} P(z)/Q(z)$, where $m > 0$ and $P$ and $Q$ are polynomials such that the degree of $Q$ exceeds that of $P$ by 1 or more. This makes it possible to evaluate integrals of the form

$$\int_{-\infty}^{\infty} e^{imx}\, \frac{P(x)}{Q(x)}\,dx$$

by the method described in Theorem 16.37.

16.38 Use the method suggested in Exercise 16.37 to evaluate the following integrals:

a) $\displaystyle\int_0^{\infty} \frac{\sin mx}{x(a^2 + x^2)}\,dx = \frac{\pi}{2a^2}\left(1 - e^{-am}\right)$ if $m > 0$, $a > 0$.

b) $\displaystyle\int_0^{\infty} \frac{\cos mx}{x^4 + a^4}\,dx = \frac{\pi}{2a^3}\, e^{-ma/\sqrt{2}} \sin\left(\frac{ma}{\sqrt{2}} + \frac{\pi}{4}\right)$ if $m > 0$, $a > 0$.

16.39 Let $w = e^{2\pi i/3}$ and let $\gamma$ be a positively oriented circle whose graph does not pass through 1, $w$, or $w^2$. (The numbers 1, $w$, $w^2$ are the cube roots of 1.) Prove that the integral

$$\int_\gamma \frac{(z + 1)}{z^3 - 1}\,dz$$

is equal to $2\pi i(m + nw)/3$, where $m$ and $n$ are integers. Determine the possible values of $m$ and $n$ and describe how they depend on $\gamma$.


16.40 Let $\gamma$ be a positively oriented circle with center 0 and radius $< 2\pi$. If $a$ is complex and $n$ is an integer, let

$$I(n, a) = \frac{1}{2\pi i} \int_\gamma \frac{z^{n-1} e^{az}}{1 - e^z}\,dz.$$

Prove that

$$I(0, a) = \tfrac{1}{2} - a, \quad I(1, a) = -1, \quad \text{and} \quad I(n, a) = 0 \text{ if } n > 1.$$

Calculate $I(-n, a)$ in terms of Bernoulli polynomials when $n \ge 1$ (see Exercise 9.38).

16.41 This exercise requests some of the details of the proof of Theorem 16.38. Let

$$g(z) = \sum_{r=0}^{n-1} e^{\pi i a (z + r)^2/n}, \qquad f(z) = \frac{g(z)}{e^{2\pi i z} - 1},$$

where $a$ and $n$ are positive integers with $na$ even. Prove that:

a) $g(z + 1) - g(z) = e^{\pi i a z^2/n}\left(e^{2\pi i z} - 1\right) \sum_{m=0}^{a-1} e^{2\pi i m z}$.

b) $\operatorname{Res}_{z=0} f(z) = g(0)/(2\pi i)$.

c) The real part of $i(t + Re^{\pi i/4} + r)^2$ is $-(\sqrt{2}\,tR + R^2 + \sqrt{2}\,rR)$.


One-to-one analytic functions 

16.42 Let $S$ be an open subset of $C$ and assume that $f$ is analytic and one-to-one on $S$.
Prove that:

a) $f'(z) \ne 0$ for each $z$ in $S$. (Hence $f$ is conformal at each point of $S$.)

b) If $g$ is the inverse of $f$, then $g$ is analytic on $f(S)$ and $g'(w) = 1/f'(g(w))$ if
$w \in f(S)$.

16.43 Let $f : C \to C$ be analytic and one-to-one on $C$. Prove that $f(z) = az + b$, where
$a \ne 0$. What can you conclude if $f$ is one-to-one on $C^*$ and analytic on $C^*$ except possibly
for a finite number of poles?

16.44 If $f$ and $g$ are Möbius transformations, show that the composition $f \circ g$ is also a
Möbius transformation.
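One way to build intuition for the composition of Möbius transformations is to compare it numerically with a product of 2 × 2 coefficient matrices. The sketch below is an illustration only, not a proof: the helper names `mobius` and `compose_coefficients` and the sample coefficients are assumptions introduced here.

```python
def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d), assuming ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    return lambda z: (a * z + b) / (c * z + d)

def compose_coefficients(m1, m2):
    """Coefficients obtained by multiplying the 2x2 matrices of m1 and m2."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

F = (1 + 2j, 3, 1j, 2)    # sample coefficients (a, b, c, d) for f
G = (2, -1j, 1, 4 + 1j)   # sample coefficients (a, b, c, d) for g

f, g = mobius(*F), mobius(*G)
h = mobius(*compose_coefficients(F, G))  # candidate formula for f(g(z))

# Compare f(g(z)) with h(z) at a few points away from the poles.
for z in (0.3 + 0.7j, -2 + 1j, 5j):
    print(z, abs(f(g(z)) - h(z)))
```

The printed differences are at the level of rounding error, which is consistent with the matrix-product rule the exercise asks you to establish algebraically.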

16.45 Describe geometrically what happens to a point $z$ when it is carried into $f(z)$ by the
following special Möbius transformations:

a) $f(z) = z + b$ (Translation).

b) $f(z) = az$, where $a > 0$ (Stretching or contraction).

c) $f(z) = e^{i\alpha}z$, where $\alpha$ is real (Rotation).

d) $f(z) = 1/z$ (Inversion).





16.46 If $c \ne 0$, we have

$$\frac{az + b}{cz + d} = \frac{a}{c} + \frac{bc - ad}{c(cz + d)}.$$

Hence every Möbius transformation can be expressed as a composition of the special cases
described in Exercise 16.45. Use this fact to show that Möbius transformations carry
circles into circles (where straight lines are considered as special cases of circles).
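The decomposition $(az + b)/(cz + d) = a/c + (bc - ad)/(c(cz + d))$ and the circle-preserving property can both be spot-checked numerically. The following sketch is illustrative only: the coefficients, the sample circle, and the helper `circumcenter` are assumptions made here, and the circle is chosen so that it avoids the pole $-d/c$.

```python
import cmath

def circumcenter(p, q, r):
    """Center of the circle through three non-collinear complex points."""
    u, v = q - p, r - p
    det = (u.conjugate() * v).imag  # nonzero when p, q, r are not collinear
    return p + (abs(u) ** 2 * v - abs(v) ** 2 * u) / (2j * det)

a, b, c, d = 2, 1j, 1, 3  # sample coefficients with c != 0 and ad - bc != 0

def f(z):
    return (a * z + b) / (c * z + d)

def f_decomposed(z):
    w = c * z + d                  # multiply by c, then translate by d
    w = 1 / w                      # inversion
    w = (b * c - a * d) / c * w    # multiply by the constant (bc - ad)/c
    return a / c + w               # translate by a/c

# Four points on the circle |z - (1 + i)| = 2, which misses the pole z = -3.
center, radius = 1 + 1j, 2.0
points = [center + radius * cmath.exp(1j * t) for t in (0.0, 1.0, 2.5, 4.0)]
images = [f(z) for z in points]

# The circle through the first three images should also contain the fourth.
c_img = circumcenter(*images[:3])
r_img = abs(images[0] - c_img)
print("image center:", c_img, "image radius:", r_img)
print("fourth point defect:", abs(abs(images[3] - c_img) - r_img))
```

A circle passing through the pole would be carried to a straight line, in which case `circumcenter` is not applicable because the image points are collinear.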

16.47 a) Show that all Möbius transformations which map the upper half-plane
$T = \{x + iy : y > 0\}$ onto the closure of the disk $B(0; 1)$ can be expressed in the
form $f(z) = e^{i\alpha}(z - a)/(z - \bar{a})$, where $\alpha$ is real and $a \in T$.

b) Show that $\alpha$ and $a$ can always be chosen to map any three given points of the
real axis onto any three given points on the unit circle.

16.48 Find all Möbius transformations which map the right half-plane

$$S = \{x + iy : x > 0\}$$

onto the closure of $B(0; 1)$.

16.49 Find all Möbius transformations which map the closure of $B(0; 1)$ onto itself.

16.50 The fixed points of a Möbius transformation

$$f(z) = \frac{az + b}{cz + d} \qquad (ad - bc \ne 0)$$

are those points $z$ for which $f(z) = z$. Let $D = (d - a)^2 + 4bc$.

a) Determine all fixed points when $c = 0$.

b) If $c \ne 0$ and $D \ne 0$, prove that $f$ has exactly 2 fixed points $z_1$ and $z_2$ (both
finite) and that they satisfy the equation

$$\frac{f(z) - z_1}{f(z) - z_2} = Re^{i\theta}\,\frac{z - z_1}{z - z_2}, \quad \text{where } R > 0 \text{ and } \theta \text{ is real.}$$

c) If $c \ne 0$ and $D = 0$, prove that $f$ has exactly one fixed point $z_1$ and that it
satisfies the equation

$$\frac{1}{f(z) - z_1} = \frac{1}{z - z_1} + C \quad \text{for some } C \ne 0.$$

d) Given any Möbius transformation, investigate the successive images of a given
point $w$. That is, let

$$w_1 = f(w), \quad w_2 = f(w_1), \quad \ldots, \quad w_n = f(w_{n-1}),$$

and study the behavior of the sequence $\{w_n\}$. Consider the special case $a$, $b$, $c$, $d$
real, $ad - bc = 1$.
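To get a feel for the iteration of a Möbius transformation, it can help to compute a few orbits before analyzing the general case. The sketch below is a single illustrative experiment, not a classification: the helper `iterate_mobius`, the coefficients (2, 1, 1, 1) with $ad - bc = 1$, and the starting point $w = 1$ are assumptions chosen here.

```python
import math

def iterate_mobius(a, b, c, d, w, steps):
    """Return [w_1, ..., w_steps], where w_k = f(w_{k-1}) and w_0 = w."""
    orbit = []
    for _ in range(steps):
        w = (a * w + b) / (c * w + d)
        orbit.append(w)
    return orbit

# Sample real coefficients with ad - bc = 1.
a, b, c, d = 2.0, 1.0, 1.0, 1.0
orbit = iterate_mobius(a, b, c, d, w=1.0, steps=40)

# Fixed points solve c z^2 + (d - a) z - b = 0, with D = (d - a)^2 + 4bc.
D = (d - a) ** 2 + 4 * b * c
z1 = ((a - d) + math.sqrt(D)) / (2 * c)
z2 = ((a - d) - math.sqrt(D)) / (2 * c)

print("last iterate:", orbit[-1])
print("fixed points:", z1, z2)
```

In this particular case the iterates approach the fixed point `z1`. Other choices of coefficients (for example with $D = 0$ or $D < 0$) can behave differently, which is what the exercise asks you to investigate.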

MISCELLANEOUS EXERCISES 


16.51 Determine all complex $z$ such that

$$z = \sum_{n=2}^{\infty} \sum_{k=1}^{n} e^{2\pi i k z/n}.$$
16.52 If $f(z) = \sum_{n=0}^{\infty} a_n z^n$ is an entire function such that $|f(re^{i\theta})| < Me^{r^k}$ for all $r > 0$,
where $M > 0$ and $k > 0$, prove that

$$|a_n| \le \frac{Me^{n/k}}{(n/k)^{n/k}} \quad \text{for } n > 1.$$


16.53 Assume $f$ is analytic on a deleted neighborhood $B'(0; a)$. Prove that $\lim_{z \to 0} f(z)$
exists (possibly infinite) if, and only if, there exists an integer $n$ and a function $g$, analytic
on $B(0; a)$, with $g(0) \ne 0$, such that $f(z) = z^n g(z)$ in $B'(0; a)$.

16.54 Let $p(z) = \sum_{k=0}^{n} a_k z^k$ be a polynomial of degree $n$ with real coefficients satisfying

$$a_0 > a_1 > \cdots > a_{n-1} > a_n > 0.$$

Prove that $p(z) = 0$ implies $|z| > 1$. Hint. Consider $(1 - z)p(z)$.

16.55 A function $f$, defined on a disk $B(a; r)$, is said to have a zero of infinite order at $a$ if,
for every integer $k > 0$, there is a function $g_k$, analytic at $a$, such that $f(z) = (z - a)^k g_k(z)$
on $B(a; r)$. If $f$ has a zero of infinite order at $a$, prove that $f = 0$ everywhere in $B(a; r)$.

16.56 Prove Morera’s theorem: If $f$ is continuous on an open region $S$ in $C$ and if $\int_\gamma f = 0$
for every polygonal circuit $\gamma$ in $S$, then $f$ is analytic on $S$.
$\subseteq$, is a subset of, 1, 33
$R$, set of real numbers, 1
$R^+$, $R^-$, set of positive (negative) numbers, 2
$\{x : x \text{ satisfies } P\}$, the set of $x$ which satisfy property $P$, 3, 32
$(a, b)$, $[a, b]$, open (closed) interval with endpoints $a$ and $b$, 4
$[a, b)$, $(a, b]$, half-open intervals, 4
$(a, +\infty)$, $[a, +\infty)$, $(-\infty, a)$, $(-\infty, a]$, infinite intervals, 4
$Z^+$, set of positive integers, 4
$Z$, set of all integers (positive, negative, and zero), 4
$Q$, set of rational numbers, 6
$\max S$, $\min S$, largest (smallest) element of $S$, 8
$\sup$, $\inf$, supremum (infimum), 9
$[x]$, greatest integer $\le x$, 11
$R^*$, extended real-number system, 14
$C$, the set of complex numbers, the complex plane, 16
$C^*$, extended complex-number system, 24
$A \times B$, cartesian product of $A$ and $B$, 33
$F(S)$, image of $S$ under $F$, 35
$F : S \to T$, function from $S$ to $T$, 35
$\{F_n\}$, sequence whose $n$th term is $F_n$, 37
$\bigcup$, $\cup$, union, 40, 41
$\bigcap$, $\cap$, intersection, 41
$B - A$, the set of points in $B$ but not in $A$, 41
$f^{-1}(Y)$, inverse image of $Y$ under $f$, 44 (Ex. 2.7), 81
$R^n$, $n$-dimensional Euclidean space, 47
$(x_1, \ldots, x_n)$, point in $R^n$, 47
$\|x\|$, norm or length of a vector, 48
$u_k$, $k$th unit coordinate vector, 49
$B(a)$, $B(a; r)$, open $n$-ball with center $a$ (radius $r$), 49
int $S$, interior of $S$, 49, 61
$(a, b)$, $[a, b]$, $n$-dimensional open (closed) interval, 50, 52
$\bar{S}$, closure of $S$, 53, 62
$S'$, set of accumulation points of $S$, 54, 62
$(M, d)$, metric space $M$ with metric $d$, 60
$d(x, y)$, distance from $x$ to $y$ in metric space, 60
$B_M(a; r)$, ball in metric space $M$, 61
$\partial S$, boundary of a set $S$, 64
$\lim_{x \to c+}$, $\lim_{x \to c-}$, right- (left-)hand limit, 93
$f(c+)$, $f(c-)$, right- (left-)hand limit of $f$ at $c$, 93
$\Omega_f(T)$, oscillation of $f$ on a set $T$, 98 (Ex. 4.24), 170
$\omega_f(x)$, oscillation of $f$ at a point $x$, 98 (Ex. 4.24), 170
$f'(c)$, derivative of $f$ at $c$, 104, 114, 117
$D_k f$, partial derivative of $f$ with respect to the $k$th coordinate, 115
$D_{r,k} f$, second-order partial derivative, 116
$\mathcal{P}[a, b]$, set of all partitions of $[a, b]$, 128, 141
$V_f$, total variation of $f$, 129
$\Lambda_f$, length of a rectifiable path $f$, 134
$S(P, f, \alpha)$, Riemann-Stieltjes sum, 141
$f \in R(\alpha)$ on $[a, b]$, $f$ is Riemann-integrable with respect to $\alpha$ on $[a, b]$, 141
$f \in R$ on $[a, b]$, $f$ is Riemann-integrable on $[a, b]$, 142
$\alpha \nearrow$ on $[a, b]$, $\alpha$ is increasing on $[a, b]$, 150
$U(P, f, \alpha)$, $L(P, f, \alpha)$, upper (lower) Stieltjes sums, 151
lim sup, limit superior (upper limit), 184
lim inf, limit inferior (lower limit), 184
$a_n = O(b_n)$, $a_n = o(b_n)$, big oh (little oh) notation, 192
$\operatorname{l.i.m.}_{n \to \infty} f_n = f$, $\{f_n\}$ converges in the mean to $f$, 232
$f \in C^\infty$, $f$ has derivatives of every order, 241
a.e., almost everywhere, 172
$f_n \nearrow f$ a.e. on $S$, sequence $\{f_n\}$ increases on $S$ and converges to $f$ a.e. on $S$, 254
$S(I)$, set of step functions on an interval $I$, 256
$U(I)$, set of upper functions on an interval $I$, 256
$L(I)$, set of Lebesgue-integrable functions on an interval $I$, 260
$f^+$, $f^-$, positive (negative) part of a function $f$, 261
$M(I)$, set of measurable functions on an interval $I$, 279
$\chi_S$, characteristic function of $S$, 289
$\mu(S)$, Lebesgue measure of $S$, 290
$(f, g)$, inner product of functions $f$ and $g$ in $L^2(I)$, 294, 295
$\|f\|$, $L^2$-norm of $f$, 294, 295
$L^2(I)$, set of square-integrable functions on $I$, 294
$f * g$, convolution of $f$ and $g$, 328
$f'(c; u)$, directional derivative of $f$ at $c$ in the direction $u$, 344
$T_c$, $f'(c)$, total derivative, 347
$\nabla f$, gradient vector of $f$, 348
$m(T)$, matrix of a linear function $T$, 350
$Df(c)$, Jacobian matrix of $f$ at $c$, 351
$L(x, y)$, line segment joining $x$ and $y$, 355
$\det [a_{ij}]$, determinant of matrix $[a_{ij}]$, 367
$J_f$, Jacobian determinant of $f$, 368
$f \in C'$, the components of $f$ have continuous first-order partials, 371
$\int_I f(x)\,dx$, multiple integral, 389, 407
$\underline{c}(S)$, $\bar{c}(S)$, inner (outer) Jordan content of $S$, 396
$c(S)$, Jordan content of $S$, 396
$\int_\gamma f$, contour integral of $f$ along $\gamma$, 436
$A(a; r_1, r_2)$, annulus with center $a$, 438
$n(\gamma, z)$, winding number of a circuit $\gamma$ with respect to $z$, 445
$B'(a)$, $B'(a; r)$, deleted neighborhood of $a$, 457
$\operatorname{Res}_{z=a} f(z)$, residue of $f$ at $a$, 459
Abel, Niels Henrik (1802-1829), 194, 245, 248

Abel, limit theorem, 245 
partial summation formula, 194 
test for convergence of series, 194, 248 
(Ex. 9.13) 

Absolute convergence, of products, 208 
of series, 189 
Absolute value, 13, 18 
Absolutely continuous function, 139 
Accumulation point, 52, 62 
Additive function, 45 (Ex. 2.22) 

Additivity of Lebesgue measure, 291 
Adherent point, 52, 62 
Algebraic number, 45 (Ex. 2.15) 

Almost everywhere, 172, 391
Analytic function, 434 
Annulus, 438 

Approximation theorem of Weierstrass, 
322 

Arc, 88, 435 

Archimedean property of real numbers, 10 

Arc length, 134 

Arcwise connected set, 88 

Area (content) of a plane region, 396 

Argand, Jean-Robert (1768-1822), 17 

Argument of complex number, 21 

Arithmetic mean, 205 

Arzelà, Cesare (1847-1912), 228, 273

Arzelà’s theorem, 228, 273

Associative law, 2, 16 

Axioms for real numbers, 1, 2, 9 

Ball, in a metric space, 61 
in $R^n$, 49

Basis vectors, 49

Bernoulli, James (1654-1705), 251, 338, 
478 

Bernoulli, numbers, 251 (Ex. 9.38) 
periodic functions, 338 (Ex. 11.18) 
polynomials, 251 (Ex. 9.38), 478 (Ex. 
16.40) 


Bernstein, Sergei Natanovic (1880- ), 242

Bernstein’s theorem, 242 
Bessel, Friedrich Wilhelm (1784-1846), 
309, 475 

Bessel function, 475 (Ex. 16.21) 

Bessel inequality, 309 
Beta function, 331 
Binary system, 225 
Binomial series, 244 
Bolzano, Bernard (1781-1848), 54, 85 
Bolzano’s theorem, 85 
Bolzano-Weierstrass theorem, 54 
Bonnet, Ossian (1819-1892), 165 
Bonnet’s theorem, 165 
Borel, Emile (1871-1938), 58 
Bound, greatest lower, 9 
least upper, 9 
lower, 8 
uniform, 221 
upper, 8 

Boundary, of a set, 64 
point, 64 

Bounded, away from zero, 130 
convergence, 227, 273 
function, 83 
set, 54, 63 
variation, 128 


Cantor, Georg (1845-1918), 8, 32, 56, 67, 
180, 312 

Cantor intersection theorem, 56 
Cantor-Bendixson theorem, 67 (Ex. 3.25)
Cantor set, 180 (Ex. 7.32) 

Cardinal number, 38 
Carleson, Lennart, 312 
Cartesian product, 33 
Casorati-Weierstrass theorem, 475 (Ex. 
16.23) 

Cauchy, Augustin-Louis (1789-1857), 14, 
73, 118, 177, 183, 207, 222 
Cauchy condition,
for products, 207 
for sequences, 73, 183 
for series, 186 

for uniform convergence, 222, 223 
Cauchy, inequalities, 451 
integral formula, 443 
integral theorem, 439 
principal value, 277 
product, 204 
residue theorem, 460 
sequence, 73 

Cauchy-Riemann equations, 118 
Cauchy-Schwarz inequality, for inner 
products, 294 

for integrals, 177 (Ex. 7.16), 294 
for sums, 14, 27 (Ex. 1.23), 30 (Ex. 1.48) 
Cesàro, Ernesto (1859-1906), 205, 320
Cesàro, sum, 205

summability of Fourier series, 320 
Chain rule, complex functions, 117 
real functions, 107 
matrix form of, 353 
vector-valued functions, 114 
Change of variables, in a Lebesgue integral, 
262 

in a multiple Lebesgue integral, 421 
in a Riemann integral, 164 
in a Riemann-Stieltjes integral, 144 
Characteristic function, 289 
Circuit, 435 

Closed, ball, 67 (Ex. 3.31) 
curve, 435 
interval, 4, 52 
mapping, 99 (Ex. 4.32) 
region, 90 
set, 53, 62 
Closure of a set, 53 
Commutative law, 2, 16 
Compact set, 59, 63 
Comparison test, 190 
Complement, 41 
Complete metric space, 74 
Complete orthonormal set, 336 (Ex. 11.6) 
Completeness axiom, 9 
Complex number, 15 
Complex plane, 17 
Component, interval, 51 
of a metric space, 87 
of a vector, 47 
Composite function, 37 


Condensation point, 67 (Ex. 3.23) 
Conditional convergent series, 189 
rearrangement of, 197 
Conformal mapping, 471 
Conjugate complex number, 28 (Ex. 1.29) 
Connected, metric space, 86 
set, 86 

Content, 396 
Continuity, 78 
uniform, 90 

Continuously differentiable function, 371 
Contour integral, 436 
Contraction, constant, 92 
fixed-point theorem, 92 
mapping, 92 

Convergence, absolute, 189 
bounded, 227
conditional, 189 
in a metric space, 70 
mean, 232 
of a product, 207 
of a sequence, 183 
of a series, 185 
pointwise, 218 
uniform, 221 

Converse of a relation, 36 
Convex set, 66 (Ex. 3.14) 

Convolution integral, 328 
Convolution theorem, for Fourier transforms, 329

for Laplace transforms, 342 (Ex. 11.36) 
Coordinate transformation, 417 
Countable additivity, 291 
Countable set, 39 
Covering of a set, 56 
Covering theorem, Heine-Borel, 58 
Lindelöf, 57
Cramer’s rule, 367 
Curve, closed, 435 
Jordan, 435 
piecewise-smooth, 435 
rectifiable, 134 


Daniell, P. J. (1889-1946), 252 
Darboux, Gaston (1842-1917), 152 
Decimals, 11, 12, 27 (Ex. 1.22) 
Dedekind, Richard (1831-1916), 8 
Deleted neighborhood, 457 
De Moivre, Abraham (1667-1754), 29
De Moivre’s theorem, 29 (Ex. 1.44) 
Dense set, 68 (Ex. 3.32)

Denumerable set, 39 
Derivative(s), of complex functions, 117 
directional, 344 
partial, 115 

of real-valued functions, 104 
total, 347 

of vector-valued functions, 114 
Derived set, 54, 62 
Determinant, 367 
Difference of two sets, 41 
Differentiation, of integrals, 162, 167 
of sequences, 229 
of series, 230 

Dini, Ulisse (1845-1918), 248, 312, 319 
Dini’s theorem, on Fourier series, 319 
on uniform convergence, 248 (Ex. 9.9) 
Directional derivative, 344 
Dirichlet, Peter Gustav Lejeune (1805-1859), 194, 205, 215, 230, 317, 464
Dirichlet, integrals, 314 
kernel, 317 
product, 205 
series, 215 (Ex. 8.34) 

Dirichlet’s test, for convergence of series, 
194 

for uniform convergence of series, 230 
Disconnected set, 86 
Discontinuity, 93 
Discrete metric space, 61 
Disjoint sets, 41 
collection of, 42 
Disk, 49 

of convergence, 234 
Distance function (metric), 60 
Distributive law, 2, 16 
Divergent, product, 207 
sequence, 183 
series, 185 
Divisor, 4 
greatest common, 5 
Domain (open region), 90 
Domain of a function, 34 
Dominated convergence theorem, 270 
Dot product, 48 
Double, integral, 390, 407 
Double sequence, 199 
Double series, 200 

Du Bois-Reymond, Paul (1831-1889), 312 
Duplication formula for the Gamma 
function, 341 (Ex. 11.31) 


e, irrationality of, 7 
Element of a set, 32 
Empty set, 33 
Equivalence, of paths, 136 
relation, 43 (Ex. 2.2) 

Essential singularity, 458 
Euclidean, metric, 48, 61 
space $R^n$, 47
Euclid’s lemma, 5 

Euler, Leonhard (1707-1783), 149, 192,
209, 365 

Euler’s, constant, 192 
product for $\zeta(s)$, 209
summation formula, 149 
theorem on homogeneous functions, 365 
(Ex. 12.18) 

Exponential form, of Fourier integral 
theorem, 325 
of Fourier series, 323 
Exponential function, 7, 19 
Extended complex plane, 25 
Extended real-number system, 14 
Extension of a function, 35 
Exterior (or outer region) of a Jordan curve, 
447 

Extremum problems, 375 

Fatou, Pierre (1878-1929), 299 
Fatou’s lemma, 299 (Ex. 10.8) 

Fejér, Leopold (1880-1959), 179, 312, 320
Fejér’s theorem, 179 (Ex. 7.23), 320
Fekete, Michel, 178 
Field, of complex numbers, 116 
of real numbers, 2 
Finite set, 38 

Fischer, Ernst (1875-1954), 297, 311 
Fixed point, of a function, 92 
Fixed-point theorem, 92 
Fourier, Joseph (1758-1830), 306, 309, 
312, 324, 326 
Fourier coefficient, 309 
Fourier integral theorem, 324 
Fourier series, 309 
Fourier transform, 326 
Fubini, Guido (1879-1943), 405, 410, 413 
Fubini’s theorem, 410, 413 
Function, definition of, 34 
Fundamental theorem, of algebra, 15, 451, 
475 (Ex. 16.15) 
of integral calculus, 162 
Gamma function, continuity of, 282
definition of, 277 
derivative of, 284, 303 (Ex. 10.29) 
duplication formula for, 341 (Ex. 11.31) 
functional equation for, 278 
series for, 304 (Ex. 10.31) 

Gauss, Karl Friedrich (1777-1855), 17, 
464 

Gaussian sum, 464 

Geometric series, 190, 195 

Gibbs’ phenomenon, 338 (Ex. 11.19) 

Global property, 79 

Goursat, Edouard (1858-1936), 434 

Gradient, 348 

Gram, Jørgen Pedersen (1850-1916), 335
Gram-Schmidt process, 335 (Ex. 11.3) 
Greatest lower bound, 9 

Hadamard, Jacques (1865-1963), 386 
Hadamard determinant theorem, 386 (Ex. 
13.16) 

Half-open interval, 4 

Hardy, Godfrey Harold (1877-1947), 30, 
206, 217, 251, 312 
Harmonic series, 186 
Heine, Eduard (1821-1881), 58, 91, 312 
Heine-Borel covering theorem, 58 
Heine’s theorem, 91 

Hobson, Ernest William (1856-1933), 
312, 415 

Homeomorphism, 84 
Homogeneous function, 364 (Ex. 12.18) 
Homotopic paths, 440 
Hyperplane, 394 

Identity theorem for analytic functions, 452 
Image, 35 
Imaginary part, 15 
Imaginary unit, 18 
Implicit-function theorem, 374 
Improper Riemann integral, 276 
Increasing function, 94, 150 
Increasing sequence, of functions, 254 
of numbers, 71, 185 

Independent set of functions, 335 (Ex. 11.2)
Induction principle, 4 
Inductive set, 4 
Inequality, Bessel, 309 
Cauchy-Schwarz, 14, 177 (Ex. 7.16), 294 
Minkowski, 27 (Ex. 1.25) 
triangle, 13, 294 


Infimum, 9 

Infinite, derivative, 108 
product, 206 
series, 185 
set, 38 

Infinity, in C*, 24 
in R*, 14 

Inner Jordan content, 396 
Inner product, 48, 294 
Integers, 4 

Integrable function, Lebesgue, 260, 407 
Riemann, 141, 389 
Integral, equation, 181 
test, 191 
transform, 326 

Integration by parts, 144, 278 
Integrator, 142 

Interior (or inner region) of a Jordan curve, 
447 

Interior, of a set, 49, 61 
Interior point, 49, 61 

Intermediate-value theorem, for continuous 
functions, 85 
for derivatives, 112 
Intersection of sets, 41 
Interval, in R, 4 
in $R^n$, 50, 52
Inverse function, 36 
Inverse-function theorem, 372 
Inverse image, 44 (Ex. 2.7), 81 
Inversion formula, for Fourier transforms, 
327 

for Laplace transforms, 342 (Ex. 11.38), 
468 

Irrational numbers, 7 
Isolated point, 53 
Isolated singularity, 458 
Isolated zero, 452 
Isometry, 84 

Iterated integral, 167, 287 
Iterated limit, 199 
Iterated series, 202 


Jacobi, Carl Gustav Jacob (1804-1851), 
351, 368

Jacobian, determinant, 368 
matrix, 351 

Jordan, Camille (1838-1922), 312, 319,
396, 435, 447 
Jordan, arc, 435 
content, 396 
curve, 435
curve theorem, 447 
theorem on Fourier series, 319 
Jordan-measurable set, 396 
Jump, discontinuity, 93 
of a function, 93 

Kestelman, Hyman, 165, 182 
Kronecker delta, $\delta_{ij}$, 385 (Ex. 13.6)

$L^2$-norm, 293, 295

Lagrange, Joseph Louis (1736-1813), 27, 
30, 380 

Lagrange, identity, 27 (Ex. 1.23), 30 (Ex. 
1.48), 380 
multipliers, 380 

Landau, Edmund (1877-1938), 31 
Laplace, Pierre Simon (1749-1827), 326, 
342, 468 

Laplace transform, 326, 342, 468 
Laurent, Pierre Alphonse (1813-1854), 
455 

Laurent expansion, 455 
Least upper bound, 9 
Lebesgue, Henri (1875-1941), 141, 171, 
260, 270, 273, 290, 292, 312, 391, 405 
bounded convergence theorem, 273 
criterion for Riemann integrability, 171, 
391 

dominated-convergence theorem, 270 
integral of complex functions, 292 
integral of real functions, 260, 407 
measure, 290, 408 

Legendre, Adrien-Marie (1752-1833), 336 
Legendre polynomials, 336 (Ex. 11.7) 
Leibniz, Gottfried Wilhelm (1646-1716), 
121 

Leibniz’ formula, 121 (Ex. 5.6) 

Length of a path, 134 
Levi, Beppo (1875-1961), 265, 267, 268, 
407 

Levi monotone convergence theorem, for 
sequences, 267 
for series, 268 
for step functions, 265 
Limit, inferior, 184 
in a metric space, 71 
superior, 184 
Limit function, 218 
Limit theorem of Abel, 245 
Lindelöf, Ernst (1870-1946), 56
Lindelöf covering theorem, 57
Linear function, 345 
Linear space, 48 
of functions, 137 (Ex. 6.4) 

Line segment in $R^n$, 88
Linearly dependent set of functions, 122 
(Ex. 5.9) 

Liouville, Joseph (1809-1882), 451 
Liouville’s theorem, 451 
Lipschitz, Rudolph (1831-1904), 121, 137, 
312, 316 

Lipschitz condition, 121 (Ex. 5.1), 137 (Ex. 
6.2), 316 

Littlewood, John Edensor (1885- ), 312

Local extremum, 98 (Ex. 4.25) 

Local property, 79 
Localization theorem, 318 
Logarithm, 23 
Lower bound, 8 
Lower integral, 152 
Lower limit, 184 

Mapping, 35 
Matrix, 350 
product, 351 

Maximum and minimum, 83, 375 
Maximum-modulus principle, 453, 454 
Mean convergence, 232 
Mean-Value Theorem for derivatives, 
of real-valued functions, 110 
of vector-valued functions, 355 
Mean-Value Theorem for integrals, 
multiple integrals, 401 
Riemann integrals, 160, 165 
Riemann-Stieltjes integrals, 160 
Measurable function, 279, 407 
Measurable set, 290, 408 
Measure, of a set, 290, 408 
zero, 169, 290, 391, 405 
Mertens, Franz (1840-1927), 204 
Mertens’ theorem, 204 
Metric, 60 
Metric space, 60 

Minimum-modulus principle, 454 
Minkowski, Hermann (1864-1909), 27 
Minkowski’s inequality, 27 (Ex. 1.25) 
Möbius, August Ferdinand (1790-1868), 471
Möbius transformation, 471
Modulus of a complex number, 18 
Monotonic function, 94 
Monotonic sequence, 185
Multiple integral, 389, 407 
Multiplicative function, 216 (Ex. 8.45) 

Neighborhood, 49 
of infinity, 15, 25 

Niven, Ivan M. (1915- ), 180 (Ex. 7.33)
$n$-measure, 408
Nonempty set, 1 

Nonmeasurable function, 304 (Ex. 10.37) 
Nonmeasurable set, 304 (Ex. 10.36) 
Nonnegative, 3 

Norm, of a function, 102 (Ex. 4.66) 
of a partition, 141 
of a vector, 48 

O, o, oh notation, 192 
One-to-one function, 36 
Onto, 35 
Operator, 327 
Open, covering, 56, 63 
interval in R, 4 
interval in $R^n$, 50
mapping, 370, 454
mapping theorem, 371, 454
set in a metric space, 62
set in $R^n$, 49
Order, of pole, 458
of zero, 452
Ordered $n$-tuple, 47
Ordered pair, 33 
Order-preserving function, 38 
Ordinate set, 403 (Ex. 14.11) 

Orientation of a circuit, 447 
Orthogonal system of functions, 306 
Orthonormal set of functions, 306 
Oscillation of a function, 98 (Ex. 4.24), 170 
Outer Jordan content, 396 

Parallelogram law, 17 
Parseval, Marc-Antoine (circa 1776-1836), 309, 474
Parseval’s formula, 309, 474 (Ex. 16.12)
Partial derivative, 115 
of higher order, 116 
Partial sum, 185 
Partial summation formula, 194 
Partition of an interval, 128, 141 
Path, 88, 133, 435 

Peano, Giuseppe (1858-1932), 224 


Perfect set, 67 (Ex. 3.25) 

Periodic function, 224, 317 
Pi, $\pi$, irrationality of, 180 (Ex. 7.33)
Piecewise-smooth path, 435
Point, in a metric space, 60
in $R^n$, 47

Pointwise convergence, 218
Poisson, Siméon Denis (1781-1840), 332,
473 

Poisson, integral formula, 473 (Ex. 16.5) 
summation formula, 332 
Polar coordinates, 20, 418 
Polygonal curve, 89 
Polygonally connected set, 89 
Polynomial, 80 
in two variables, 462 
zeros of, 451, 475 (Ex. 16.15) 

Power series, 234 

Powers of complex numbers, 21, 23 
Prime number, 5 

Prime-number theorem, 175 (Ex. 7.10) 
Principal part, 456 
Projection, 394 

Quadratic form, 378 
Quadric surface, 383 
Quotient, of complex numbers, 16 
of real numbers, 2 

Radius of convergence, 234 
Range of a function, 34 
Ratio test, 193 
Rational function, 81, 462 
Rational number, 6 
Real number, 1 
Real part, 15 

Rearrangement of series, 196 
Reciprocity law for Gauss sums, 464 
Rectifiable path, 134 
Reflexive relation, 43 (Ex. 2.2) 

Region, 89 
Relation, 34 

Removable discontinuity, 93 
Removable singularity, 458 
Residue, 459 
Residue theorem, 460 
Restriction of a function, 35 
Riemann, Georg Friedrich Bernhard
(1826-1866), 17, 142, 153, 192, 209, 
312, 313, 318, 389, 475 
condition, 153 
integral, 142, 389
localization theorem, 318 
sphere, 17 

theorem on singularities, 475 (Ex. 16.22) 
zeta function, 192, 209 
Riemann-Lebesgue lemma, 313 
Riesz, Frigyes (1880-1956), 252, 297, 305, 
311 

Riesz-Fischer theorem, 297, 311 
Righthand derivative, 108 
Righthand limit, 93 
Rolle, Michel (1652-1719), 110 
Rolle’s theorem, 110 
Root test, 193 

Roots of complex numbers, 22 
Rouché, Eugène (1832-1910), 475
Rouché’s theorem, 475 (Ex. 16.14)


Saddle point, 377 
Scalar, 48 

Schmidt, Erhard (1876-1959), 335 
Schoenberg, Isaac J., (1903- ), 224 

Schwarz, Hermann Amandus (1843-1921),
14, 27, 30, 122, 177, 294 
Schwarzian derivative, 122 (Ex. 5.7) 
Schwarz’s lemma, 474 (Ex. 16.13) 
Second-derivative test for extrema, 378 
Second Mean-Value Theorem for Riemann 
integrals, 165 
Semimetric space, 295 
Separable metric space, 68 (Ex. 3.33) 
Sequence, definition of, 37 
Set algebra, 40 

Similar (equinumerous) sets, 38 
Simple curve, 435 
Simply connected region, 443 
Singularity, 458 
essential, 459 
pole, 458 
removable, 458 

Slobbovian integral, 249 (Ex. 9.17) 
Space-filling curve, 224 
Spherical coordinates, 419 
Square-integrable functions, 294 
Stationary point, 377 
Step function, 148, 406 
Stereographic projection, 17 
Stieltjes, Thomas Jan (1856-1894), 140 
Stieltjes integral, 140 
Stone, Marshall H. (1903- ), 252 

Strictly increasing function, 94 


Subsequence, 38 
Subset, 1, 32 

Substitution theorem for power series, 238 
Sup norm, 102 (Ex. 4.66) 

Supremum, 9 

Symmetric quadratic form, 378 
Symmetric relation, 43 (Ex. 2.2) 


Tannery, Jules (1848-1910), 299 
Tannery’s theorem, 299 (Ex. 10.7) 

Tauber, Alfred (1866-circa 1947), 246 
Tauberian theorem, 246, 251 (Ex. 9.37) 
Taylor, Brook (1685-1731), 113, 241, 
361, 449 

Taylor’s formula with remainder, 113 
for functions of several variables, 361 
Taylor’s series, 241, 449 
Telescoping series, 186 
Theta function, 334 
Tonelli, Leonida (1885-1946), 415 
Tonelli-Hobson test, 415 
Topological, mapping, 84 
property, 84 
Topology, point set, 47 
Total variation, 129, 178 (Ex. 7.20) 
Transformation, 35, 417 
Transitive relation, 43 (Ex. 2.2) 

Triangle inequality, 13, 19, 48, 60, 294 
Trigonometric series, 312 
Two-valued function, 86 


Uncountable set, 39 
Uniform bound, 221 
Uniform continuity, 90 
Uniform convergence, of sequences, 221 
of series, 223 

Uniformly bounded sequence, 201 
Union of sets, 41 
Unique factorization theorem, 6 
Unit coordinate vectors, 49 
Upper bound, 8 
Upper half-plane, 463 
Upper function, 256, 406 
Upper integral, 152 
Upper limit, 184 


Vallée-Poussin, C. J. de la (1866-1962),
312 

Value of a function, 34 
Variation, bounded, 128
total, 129 
Vector, 47 

Vector-valued function, 77 
Volume, 388, 397 

Well-ordering principle, 25 (Ex. 1.6) 
Weierstrass, Karl (1815-1897), 8, 54, 223, 
322, 475 

approximation theorem, 322 
$M$-test, 223
Winding number, 445 


Wronski, J. M. H. (1778-1853), 122 
Wronskian, 122 (Ex. 5.9) 

Young, William Henry (1863-1942), 252, 
312 

Zero measure, 169, 391, 405 
Zero of an analytic function, 452 
Zero vector, 48 

Zeta function, Euler product for, 209 
integral representation, 278 
series representation, 192 



